Abstract

The question of which costs admit unique optimizers in the Monge-Kantorovich problem 
of optimal transportation between arbitrary probability densities is investigated. For smooth 
costs and densities on compact manifolds, the only known examples for which the optimal 
solution is always unique require at least one of the two underlying spaces to be homeomorphic 
to a sphere. We introduce a (multivalued) dynamics which the transportation cost induces 
between the target and source space, for which the presence or absence of a sufficiently large 
set of periodic trajectories plays a role in determining whether or not optimal transport is 
necessarily unique. This insight allows us to construct smooth costs on a pair of compact 
manifolds with arbitrary topology, so that the optimal transportation between any pair of 
probability densities is unique.


1 Introduction 

Let $M$ and $N$ be smooth closed manifolds (meaning compact, without boundary) of dimensions
$m$ and $n \ge 1$ respectively, and $c : M \times N \to \mathbb{R}$ a continuous cost function. Given two
probability measures $\mu$ and $\nu$ respectively on $M$ and $N$, the Monge problem consists in minimizing the
transportation cost

$$\int_M c(x, T(x))\, d\mu(x), \qquad (1.1)$$

among all transport maps $T$ from $\mu$ to $\nu$, that is, such that $T_\# \mu = \nu$. A classical way to prove existence
and uniqueness of optimal transport maps is to relax the Monge problem into the Kantorovitch
problem. That problem is a linear optimization problem under convex constraints; it consists in
minimizing the transportation cost 


$$\int_{M \times N} c(x,y)\, d\gamma(x,y), \qquad (1.2)$$

among all transport plans between $\mu$ and $\nu$, meaning $\gamma$ belongs to the set $\Pi(\mu, \nu)$ of non-negative
measures having marginals $\mu$ and $\nu$. By classical (weak) compactness arguments, minimizers for
the Kantorovitch problem always exist. A way to get existence and uniqueness of minimizers for 
the Monge problem is to show that any minimizer of (1.2) is supported on a graph. Assuming that
$c$ is Lipschitz and $\mu$ is absolutely continuous with respect to the Lebesgue measure, a condition
which guarantees this graph property is the following nonsmooth version [7] [24] of the TWIST
condition

$$D^-_x c(\cdot, y_1) \cap D^-_x c(\cdot, y_2) = \emptyset \qquad \forall y_1 \neq y_2 \in N,\ \forall x \in M,$$

where $D^-_x c(\cdot, y_i)$ denotes the sub-differential of the function $x \mapsto c(x, y_i)$ at $x$. In this case, it
is well-known how to use linear programming duality to prove that the Kantorovich minimizer is 
unique, and that Monge’s infimum is attained [11] [17]. 
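To make the relaxation (1.2) concrete, here is a minimal sketch in the simplest discrete setting, which is an assumption added for illustration and not part of the paper: both marginals are uniform on the same number of atoms. In that case the Kantorovitch linear program has an optimal vertex which is a permutation (Birkhoff-von Neumann), so one may enumerate permutations. The names `quadratic_cost` and `optimal_assignments` are hypothetical, and the totals are reported up to the common factor $1/n$.

```python
from itertools import permutations


def quadratic_cost(x, y):
    """c(x, y) = |x - y|^2 / 2 for points given as tuples of floats."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, y))


def optimal_assignments(xs, ys, cost=quadratic_cost, tol=1e-12):
    """Brute-force the discrete problem with uniform marginals.

    Returns (minimal total cost, list of all optimal permutations), where a
    permutation sigma pairs xs[i] with ys[sigma[i]].
    """
    best, argmins = float("inf"), []
    for sigma in permutations(range(len(ys))):
        total = sum(cost(xs[i], ys[sigma[i]]) for i in range(len(xs)))
        if total < best - tol:
            best, argmins = total, [sigma]      # strictly better: restart the list
        elif abs(total - best) <= tol:
            argmins.append(sigma)               # tie: another optimizer
    return best, argmins


# Unique optimizer: translate two points upward.
print(optimal_assignments([(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]))
# Two optimizers: all four source-target distances coincide.
print(optimal_assignments([(-1.0, 0.0), (1.0, 0.0)], [(0.0, -1.0), (0.0, 1.0)]))
```

In the second call every pairing has the same cost, so the optimal plan is not unique; this is the discrete shadow of the uniqueness question studied here. The enumeration is factorial in the number of atoms and is only meant for tiny examples.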

Examples of Lipschitz costs satisfying the nonsmooth TWIST are given by any cost coming
from variational problems associated with Tonelli Lagrangians of class $C^2$ (see [3]), like the square
of Riemannian distances (see [21]). Those costs are never $C^1$ on compact manifolds such as $M \times N$.
As a matter of fact, any cost $c : M \times N \to \mathbb{R}$ of class $C^1$ admits a triple $x \in M$, $y_1 \in N$, $y_2 \in N$
(take $y_1$ with $c(x,y_1) = \min\{c(x,\cdot)\}$ and $y_2$ with $c(x,y_2) = \max\{c(x,\cdot)\}$) such that


violating the nonsmooth TWIST condition. Indeed, we shall show the following holds. 

Theorem 1.1 (Non-genericity of twist). Let $c : M \times N \to [0,\infty)$ be a cost function of class $C^2$.
Assume that $\dim M = \dim N$ and

$$\exists (x,y) \in M \times N \ \text{such that}\ \frac{\partial^2 c}{\partial x\, \partial y}(x,y)\ \text{is invertible.} \qquad (1.3)$$

Then there is a pair $\mu, \nu$ of probability measures respectively on $M$ and $N$ which are both absolutely
continuous with respect to the Lebesgue measure for which there is a unique optimal transport plan
for (1.2) and such that this plan is not supported on a graph. The set of costs $c$ satisfying (1.3) is
open and dense in $C^2(M \times N; \mathbb{R})$.

The conclusion of Theorem 1.1 implies that solutions for the Monge problem with smooth cost 
do not generally exist in a compact setting. The purpose of the present paper is to study sufficient 
conditions for uniqueness of the Kantorovitch optimizer, and to exhibit smooth costs on arbitrary 
manifolds for which optimal plans are unique, despite the fact that such plans are not generally 
concentrated on graphs. Some examples of such costs have been given in [13] [1] (see also [5]). 
However, if uniqueness is to hold for arbitrary absolutely continuous $\mu$ and $\nu$ on $M$ and $N$, all
previous examples which we are aware of that involve smooth costs have required at least one of
the two compact manifolds to be homeomorphic to a sphere. Here we go far beyond this, to construct
examples of such costs on compact manifolds whose topology can be arbitrary. Our main idea is 
to relate the uniqueness of the Kantorovitch optimizer to a multivalued dynamics induced by the 
cost which does not seem to have been considered previously. 

Before stating our results, we need to introduce some definitions. 

Denoting the non-negative integers by $\mathbb{N} = \{0,1,2,\ldots\}$ and the positive integers by
$\mathbb{N}^* = \mathbb{N} \setminus \{0\}$, we begin by recalling the well-known notion of c-cyclical monotonicity.

Definition 1.2 (c-cyclical monotonicity). A set $S \subset M \times N$ is c-cyclically monotone when for all
$I \in \mathbb{N}^*$ and $(x_i, y_i) \in S$ for $i = 1, \ldots, I$ with $x_{I+1} = x_1$, we have

$$\sum_{i=1}^{I} \left[ c(x_{i+1}, y_i) - c(x_i, y_i) \right] \ge 0.$$
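For a finite set $S$, Definition 1.2 can be tested mechanically. The sketch below is an added illustration (the function names are hypothetical, not from the paper): it views each pair as a node, puts weight $c(x_j, y_i) - c(x_i, y_i)$ on the edge $i \to j$, and uses the Floyd-Warshall recursion to detect a cycle of negative total weight. Such a cycle exists exactly when the inequality of the definition fails for some finite sequence of pairs.

```python
def quadratic_cost_1d(x, y):
    """c(x, y) = |x - y|^2 / 2 on the real line (illustrative choice of cost)."""
    return 0.5 * (x - y) ** 2


def is_c_cyclically_monotone(pairs, cost, tol=1e-12):
    """Check c-cyclical monotonicity of a finite list of pairs (x_i, y_i)."""
    n = len(pairs)
    # dist[i][j] starts as the cost change when y_i is re-paired with x_j.
    dist = [[cost(pairs[j][0], pairs[i][1]) - cost(pairs[i][0], pairs[i][1])
             for j in range(n)] for i in range(n)]
    # Floyd-Warshall: afterwards dist[i][i] < 0 iff node i lies on a negative cycle.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return all(dist[i][i] >= -tol for i in range(n))


# Monotone pairing on the line is c-cyclically monotone for the quadratic cost ...
print(is_c_cyclically_monotone([(0.0, 0.0), (1.0, 1.0)], quadratic_cost_1d))
# ... while the order-reversing pairing is not.
print(is_c_cyclically_monotone([(0.0, 1.0), (1.0, 0.0)], quadratic_cost_1d))
```

The cubic running time is adequate for small supports; the tolerance `tol` is an arbitrary guard against rounding.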


For given $\mu$, $\nu$ and $c$, it is also well-known [12] that some closed c-cyclically monotone subset
$S \subset M \times N$ contains the support of all optimizers to (1.2). Note that of course, any subset of a
c-cyclically monotone set is c-cyclically monotone as well. We come now to the concepts which will 
play a major role. 


Definition 1.3 (Alternant chains). For each $(x,y) \in M \times N$ assume $c(x,\cdot)$ and $c(\cdot,y)$ are
differentiable. Fixing $S \subset M \times N$, we call chain in $S$ of length $L \ge 1$ (or $L$-chain for short) any ordered
family of pairs

$$\big( (x_1,y_1), \ldots, (x_L,y_L) \big) \in S^L$$

such that the set

$$\{ (x_1,y_1), \ldots, (x_L,y_L) \}$$

is c-cyclically monotone and for every $l = 1, \ldots, L-1$ there holds either

$$x_l = x_{l+1} \ \text{and}\ y_l \neq y_{l+1} = y_{\min\{L,\,l+2\}} \ \text{and}\ \frac{\partial c}{\partial x}(x_l,y_l) = \frac{\partial c}{\partial x}(x_l,y_{l+1}), \qquad (1.4)$$

or

$$y_l = y_{l+1} \ \text{and}\ x_l \neq x_{l+1} = x_{\min\{L,\,l+2\}} \ \text{and}\ \frac{\partial c}{\partial y}(x_l,y_l) = \frac{\partial c}{\partial y}(x_{l+1},y_l). \qquad (1.5)$$

The chain is called cyclic if its projections onto $M$ and $N$ each consist of $L/2$ distinct points, in
which case $L$ must be even with $y_L = y_1$ and $x_L \neq x_1$.

Note the existence of any cyclic chain $((x_1,y_1), \ldots, (x_L,y_L))$ permits the construction of an
infinite chain $\{(x_l, y_l)\}_{l \in \mathbb{N}^*}$ by

$$(x_{kL+l}, y_{kL+l}) := (x_l, y_l) \qquad \forall k \ge 1,\ \forall l \in \{1, \ldots, L\}. \qquad (1.6)$$

Our first result is the following:
Theorem 1.4 (Optimal transport is unique if long chains are rare). Fix a cost $c \in C^1(M \times N)$.
Choose Borel probability measures $\mu$ on $M$ and $\nu$ on $N$, both absolutely continuous with respect to
Lebesgue, and let $\Pi_0$ denote the set of all optimizers for (1.2) on $\Pi(\mu,\nu)$. Let $E_0 \subset M \times N$ be a
$\sigma$-compact set which is negligible for all $\gamma \in \Pi_0$, and denote its complement by $S := (M \times N) \setminus E_0$.
Let $E_\infty$ denote the set of points which occur in $k$-chains in $S$ for arbitrarily large $k$. Then $E_\infty$
and its projections $\pi^M(E_\infty)$ and $\pi^N(E_\infty)$ are Borel. If $\gamma(E_\infty) = 0$ for every $\gamma \in \Pi_0$, then $\Pi_0$ is a
singleton.

Remark 1.5 (Extension to singular marginals). When $c \in C^{1,1}$, we can relax the absolute
continuity of $\mu$ and $\nu$ in the preceding theorem provided neither concentrates positive mass on a $c - c$
hypersurface. Here $c - c$ hypersurface refers to one which can be parameterized in local coordinates
as the graph of a difference of convex functions [28] [12] [Ij].

Corollary 1.6 (Sufficient notions of rarity). The condition $\gamma(E_\infty) = 0$ in the statement of the
theorem, and therefore its conclusions, follow from either $\mu(\pi^M(E_\infty)) = 0$ or $\nu(\pi^N(E_\infty)) = 0$.

If there is a uniform bound $K$ on the length of all chains in $M \times N$, then our theorem applies
a fortiori with $S = M \times N$ and $E_0 = \emptyset$, since $E_\infty = \emptyset$. We shall see this occurs in many cases
of interest, including for the smooth costs that we construct on compact manifolds with arbitrary 
topology. The bound K will depend on the topology. On the other hand, an obstruction to the 
uniqueness of optimal plans is the existence of a non-negligible set of periodic orbits. As shown 
below, such a property is not typical: it fails to occur for costs c in a countable intersection C of 
open dense sets. Such a countable intersection is called residual. 

Theorem 1.7 (Costs admitting cyclic chains are non-generic). When $\dim M = \dim N$, there is a
residual set $\mathcal{C}$ in $C^\infty(M \times N; \mathbb{R})$ such that no cost in $\mathcal{C}$ admits cyclic chains, and for every cost
$c \in \mathcal{C}$, there is a nonempty closed set $\Sigma \subset M \times N$ of zero (Lebesgue) volume such that

$$\frac{\partial^2 c}{\partial x\, \partial y}(x,y)\ \text{is invertible for any}\ (x,y) \in M \times N \setminus \Sigma.$$


In the terminology of Hestir and Williams [16], the absence of cyclic chains is sufficient to 
define (formally) a rooting set whose measurability would be sufficient for uniqueness. We refer 
the reader to Section 3 for further details on their approach and its aftermath [5] [1] [23]. We do 
not know if uniqueness of optimal plans between absolutely continuous measures holds for generic 
costs. However, elaborating on a celebrated result by Mañé [19] in the framework of Aubry-Mather
theory, we are able to prove that uniqueness of optimal transport plans holds for generic costs in
$C^k$ if the marginals are fixed. In $C^0$, such a result was known already to Levin [18].

Theorem 1.8 (Optimal transport between given marginals is generically unique). Fix Borel
probability measures $\mu$ and $\nu$ on compact manifolds $M$ and $N$. For each $k \in \mathbb{N} \cup \{\infty\}$, there exists a residual set
$\mathcal{C} \subset C^k(M \times N; \mathbb{R})$ such that for every $c \in \mathcal{C}$, there is a unique optimal plan between $\mu$ and $\nu$.


The paper is organized as follows. We provide examples of costs satisfying the above results in 
Section 2. We develop preliminaries on numbered limb systems and details on Hestir and Williams’ 
rooting sets in Section 3. We give the proofs of Theorem 1.4 in Section 4, of Theorem 1.7 in Section 
6, and finally of Theorem 1.8 in Section A.
2 Examples and applications

2.1 Quadratic cost on a strictly convex set 

Let us begin by recasting an example of Gangbo and McCann [13] into the framework of (alternant) 
chains. 

Fix $N \subset \mathbb{R}^{m+1}$. Let $M$ be the boundary of a strictly convex body $\Omega \subset \mathbb{R}^{m+1}$, that is a closed
set which is the boundary of a bounded open convex set and such that for any $z, z' \in M$,

$$[z, z'] \subset M \implies z = z',$$

where $[z, z']$ is the segment joining $z$ to $z'$. We aim to show that for any measures $\mu$ and $\nu$ being
absolutely continuous w.r.t. the Hausdorff $m$-dimensional measure (the $\mathcal{H}^m$ measure on $M$), we have
uniqueness of optimal plans for the cost

$$c(x,y) = \tfrac{1}{2} |x - y|^2 \qquad \forall (x,y) \in M \times N.$$

Let $\mathcal{P}(M \times N)$ denote the Borel probability measures on $M \times N$ and $\pi^M : M \times N \to M$ and
$\pi^N : M \times N \to N$ the projections onto the first and second variables. Let $\mu$ and $\nu$ be probability
measures on $M$ and $N$. We recall that the support $\Gamma \subset M \times N$ of any plan $\gamma \in \mathcal{P}(M \times N)$
minimizing

$$\inf \left\{ \int_{M \times N} c(x,y)\, d\gamma(x,y) \;\middle|\; \gamma \in \Pi(\mu,\nu) \right\} \qquad (2.1)$$

is c-cyclically monotone, which in the case $c(x,y) = |y - x|^2/2$ reads

$$\sum_{i=1}^{I} \langle y_i, x_i - x_{i+1} \rangle \ge 0,$$

for all positive integers $I$, $i = 1, \ldots, I$, $(x_i, y_i) \in \Gamma$, $x_{I+1} = x_1$. The uniqueness of optimal plans
will follow easily from the next lemma.

Lemma 2.1 (Interior links are never exposed). Fix a hypersurface $M \subset \mathbb{R}^{m+1}$, possibly incomplete.
For any submanifold $N \subset \mathbb{R}^{m+1}$ of dimension $n \le m + 1$, let $c(x,y)$ denote the restriction of
$\tfrac{1}{2}|x - y|^2$ to $M \times N$. If $((x_0, y), (x_2, y), (x_2, y'), (x_4, y'))$ is a chain in $M \times N$, then no hyperplane
strictly separates $x_2$ from $M \setminus \{x_2\}$.

Proof. To derive a contradiction, suppose $((x_0, y), (x_2, y), (x_2, y'), (x_4, y'))$ forms a chain in $M \times N$
yet $x_2$ is strictly separated from $M \setminus \{x_2\}$ by a hyperplane with inward normal $n_2$, i.e.

$$\langle x - x_2, n_2 \rangle > 0 \qquad (2.2)$$

for all $x \in M \setminus \{x_2\}$. The chain conditions imply $y' - y = \alpha n_2$ for some $\alpha \in \mathbb{R}$.
On the other hand, pairwise monotonicity of the points in the chain implies

$$\langle x_4 - x_2, y' - y \rangle = \alpha \langle x_4 - x_2, n_2 \rangle \ge 0,$$

$$\langle x_2 - x_0, y' - y \rangle = \alpha \langle x_2 - x_0, n_2 \rangle \ge 0.$$

Using (2.2) we deduce $\alpha \ge 0$ from the first inequality and $\alpha \le 0$ from the second. But $\alpha = 0$ yields
$y' = y$, contradicting the definition of a chain. □
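The step "the chain conditions imply $y' - y = \alpha n_2$" in the proof above can be unpacked as follows. This is an added sketch; the tangent-space notation $T_{x_2}M$ is an assumption of this note and is not used in the original text. The link $((x_2,y),(x_2,y'))$ satisfies (1.4), and the $x$-derivative of the restricted cost is the tangential part of $x - y$.

```latex
% Sketch: why condition (1.4) at x_2 forces y' - y to be normal to M.
\begin{align*}
\frac{\partial c}{\partial x}(x_2,y)\,[v] &= \langle x_2 - y,\, v\rangle
   && \text{for every } v \in T_{x_2}M, \\
\frac{\partial c}{\partial x}(x_2,y) = \frac{\partial c}{\partial x}(x_2,y')
   &\iff \langle y' - y,\, v\rangle = 0
   && \text{for every } v \in T_{x_2}M, \\
\langle x - x_2,\, n_2\rangle \ge 0 \ \ \forall x \in M
   &\implies \langle v,\, n_2\rangle = 0
   && \text{for every } v \in T_{x_2}M .
\end{align*}
% M is a hypersurface, so the orthogonal complement of T_{x_2}M is the line R n_2,
% hence y' - y = \alpha n_2 for some real \alpha.
```

The last implication holds because $x \mapsto \langle x - x_2, n_2 \rangle$ attains its minimum over $M$ at $x_2$, so its derivative along $M$ vanishes there.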

As a consequence we have: 

Corollary 2.2 (Chain bounds for strictly convex hypersurfaces). If $c$ is the restriction of $\tfrac{1}{2}|x - y|^2$
to $M \times N$ as above, where $M \subset \mathbb{R}^{m+1}$ is a strictly convex hypersurface, then $M \times N$ contains no
chain of length $L \ge 5$. Moreover, the projection of any 4-chain in $M \times N$ onto $N$ consists of three
distinct points, while its projection onto $M$ consists of two distinct points. If $N \subset \mathbb{R}^{m+1}$ is also a
strictly convex hypersurface, then $M \times N$ contains no chain of length $L \ge 4$.

Proof. If a chain of length $L \ge 5$ exists, it begins either with

$$((x_1, y_2), (x_3, y_2), (x_3, y_4), (x_5, y_4)) \qquad (2.3)$$

or $((x_2,y_1), (x_2,y_3), (x_4,y_3), (x_4,y_5), (x_6,y_5))$. Since $M$ is strictly convex, each point $x \in M$ is
exposed, meaning it can be strictly separated from $M \setminus \{x\}$ by a hyperplane. In the first case
Lemma 2.1 would be violated by the chain (2.3) since $x_3$ is an exposed point of $M$; in the second
it would be violated by the chain $((x_2,y_3), (x_4,y_3), (x_4,y_5), (x_6,y_5))$ since $x_4$ is an exposed point
of $M$. We are forced to conclude that no chain of length $L \ge 5$ can exist. Moreover, any chain of
length $L = 4$ in $M \times N$ must take the form $((x_2,y_1), (x_2,y_3), (x_4,y_3), (x_4,y_5))$ hence project onto
three points $y_i \in N$. The $y_i$ must all be distinct since $y_1 \neq y_3 \neq y_5$ from the definition of chain,
while $y_5 = y_1$ would make the chain cyclic, in which case it can be extended to an infinite chain
(1.6) contradicting non-existence of a chain of length 5. The projection onto $M$ therefore consists
of the two points $x_2 \neq x_4$, which are distinct by the definition of chain.

If $N \subset \mathbb{R}^{m+1}$ is also a strictly convex hypersurface then by symmetry, $M \times N$ can contain no
chain which projects to more than two points on $M$ and two points on $N$, hence no chain of length
$L \ge 4$. □

Example 2.3. Let us consider the example of the lake that already appeared in [13] and [7]. Let
$M = N$ be the unit circle in the plane, that is, the circle centered at the origin of radius 1, equipped
with the quadratic cost $c(x,y) = |y - x|^2/2$. Consider a small auxiliary circle centered on the
vertical axis, for example the circle centered at $(0, -5/2)$ of radius $1/8$, and denote by $\tilde\psi$ the distance
function to the disc $D$ enclosed by the small circle (see Figure 2.3). By construction, $\tilde\psi$ is convex
and differentiable at every point of $M$ with a gradient of norm 1.

Then we set

$$\psi(x) := \tilde\psi(x) - \tfrac{1}{2}|x|^2 \qquad \forall x \in M$$

and

$$\phi(y) := \min \{ \psi(x) + c(x,y) \mid x \in M \} \qquad \forall y \in M.$$
Figure 1: The lake

By construction, we check that

$$\psi(x) = \max \{ \phi(y) - c(x,y) \mid y \in M \} \qquad \forall x \in M. \qquad (2.4)$$
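The pair of transforms appearing in Example 2.3 can be explored numerically. The following sketch is an added illustration on a discretized circle; the grid size and the function names are assumptions of this note. It computes $\phi$ from $\psi$ by the min-transform, then transforms back and forth. On any finite grid the back-transform of $\phi$ never exceeds $\psi$, and a third transform returns $\phi$ exactly; the discrete computation does not by itself verify the equality (2.4), which relies on the specific structure of $\psi$.

```python
import math


def cost(x, y):
    """Quadratic cost c(x, y) = |x - y|^2 / 2 in the plane."""
    return 0.5 * ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)


def circle(n):
    """n equally spaced points on the unit circle."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]


def lake_potential(pts, center=(0.0, -2.5), radius=0.125):
    """psi(x) = dist(x, D) - |x|^2 / 2, with D the disc of the example."""
    out = []
    for x in pts:
        d = math.hypot(x[0] - center[0], x[1] - center[1]) - radius
        out.append(max(d, 0.0) - 0.5 * (x[0] ** 2 + x[1] ** 2))
    return out


def c_transform_min(psi, pts):
    """phi(y) = min over grid points x of psi(x) + c(x, y)."""
    return [min(psi[i] + cost(pts[i], y) for i in range(len(pts))) for y in pts]


def c_transform_max(phi, pts):
    """psi(x) = max over grid points y of phi(y) - c(x, y)."""
    return [max(phi[j] - cost(x, pts[j]) for j in range(len(pts))) for x in pts]


pts = circle(60)
psi = lake_potential(pts)
phi = c_transform_min(psi, pts)
psi_back = c_transform_max(phi, pts)     # always <= psi on the grid
phi_again = c_transform_min(psi_back, pts)  # equals phi up to rounding
print(max(b - a for a, b in zip(psi_back, psi)))
```

The printed number is the largest gap between $\psi$ and its double transform on the grid; it measures how far the discretization is from reproducing (2.4).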

Moreover, for every $x \in M$, the gradient $y(x) := \nabla_x \tilde\psi \in \mathbb{R}^2$ belongs to the set $\partial_c \psi(x) \subset M$ of
optimizers for (2.4). As a matter of fact, we have by convexity of $\tilde\psi$,

$$\tilde\psi(x') - \tilde\psi(x) \ge \langle y(x), x' - x \rangle \qquad \forall x, x' \in \mathbb{R}^2,$$

which can be written as

$$\psi(x) \le \psi(x') + \tfrac{1}{2}|x' - y(x)|^2 - \tfrac{1}{2}|x - y(x)|^2 \qquad \forall x, x' \in \mathbb{R}^2.$$

Taking the minimum over $x' \in M$, we infer that (since $y(x) \in M$)

$$c(x, y(x)) \le \phi(y(x)) - \psi(x) \qquad \forall x \in M,$$

which, because $\phi - \psi \le c$, implies that

$$c(x, y(x)) = \phi(y(x)) - \psi(x) \qquad \forall x \in M,$$

which means that $y(x) = \nabla_x \tilde\psi$ always belongs to $\partial_c \psi(x)$. For every $x \in M$, we set

$$\bar y(x) := y(x) + \lambda(x)\, x \in M,$$

where $\lambda(x) \ge 0$ is the largest nonnegative real number $\lambda$ such that $y(x) + \lambda x$ belongs to $M$ (in
other terms, $\bar y(x)$ is the intersection of $M$ with the open semi-line starting from $y(x)$ with vector $x$ if the
intersection is nonempty and $\bar y(x) = y(x)$ otherwise). For every $x \in M$, the point $\bar y(x)$ belongs to
$\partial_c \psi(x)$ as well. As a matter of fact, by convexity of $M$, the fact that the normal to $M$ at $x$ is $x$
itself and the convexity of $\tilde\psi$, we have for every $x, x' \in M$,

$$\langle \bar y(x), x' - x \rangle = \langle y(x), x' - x \rangle + \lambda(x) \langle x, x' - x \rangle \le \langle y(x), x' - x \rangle \le \tilde\psi(x') - \tilde\psi(x).$$

Proceeding as above we infer that $\bar y(x)$ belongs to $\partial_c \psi(x)$. We can check easily that for every point
$x$ close to the south pole $(0, -1)$ the points $y(x), \bar y(x)$ are distinct (see Figure 2.3). Proceeding as
in the proof of Theorem 1.1, we can construct an example of optimal transport plan which is not
concentrated on a graph.

2.2 Quadratic cost on nested strictly convex sets 

Let

$$\Omega_1 \subset \Omega_2 \subset \cdots \subset \Omega_k \subset \mathbb{R}^{m+1} \qquad (2.5)$$

be a nested family of strictly convex bodies with differentiable boundaries in $\mathbb{R}^{m+1}$. Set $M = \bigcup_{i=1}^{k} U_i$
where $U_i = \partial\Omega_i \setminus \partial\Omega_{i-1}$ is an embedding of (a portion of) the unit sphere.

Figure 2: Nested convex sets

Lemma 2.4 (Chain length bounds for nested strictly convex boundaries). If $c(x,y)$ denotes the
restriction of $\tfrac{1}{2}|x - y|^2$ to manifolds $M_L, N \subset \mathbb{R}^{m+1}$ and $M_L := \partial\Omega_1 \cup \ldots \cup \partial\Omega_L$ is a union of
boundaries of a nested sequence (2.5) of strictly convex bodies $\Omega_i \subset \mathbb{R}^{m+1}$, then $M_L \times N$ contains
no chain of length $4L + 1$. Moreover, any chain of length $4L$ has projections onto $M_L$ (respectively
$N$) which consist of $2L$ (respectively $2L + 1$) distinct points.

Proof. We prove the result by induction on $L$. Corollary 2.2 gives the result for $L = 1$. So assume
that the property is proved for $L \ge 1$ and prove it for $L + 1$. Note that although $M_L$ may not
be a submanifold of $\mathbb{R}^{m+1}$ (if the boundaries of $\Omega_i$ and $\Omega_{i+1}$ intersect), it may be regarded as an
embedding of the disjoint union $\bigcup U_i$ of potentially incomplete manifolds $U_i = \partial\Omega_i \setminus \partial\Omega_{i-1}$.
Any chain in $M_{L+1} \times N$ of length $4L + 5$ takes one of the forms

$$((x_1, y_2), (x_3, y_2), (x_3, y_4), \ldots, (x_{4L+3}, y_{4L+4}), (x_{4L+5}, y_{4L+4}), (x_{4L+5}, y_{4L+6})) \quad \text{or} \qquad (2.6)$$

$$((x_2, y_1), (x_2, y_3), (x_4, y_3), \ldots, (x_{4L+4}, y_{4L+3}), (x_{4L+4}, y_{4L+5}), (x_{4L+6}, y_{4L+5})). \qquad (2.7)$$

Strict convexity of $\partial\Omega_{L+1}$ shows any $x \in \partial\Omega_{L+1}$ can be separated from $M_{L+1} \setminus \{x\}$ by a hyperplane.
Lemma 2.1 therefore implies $\{x_4, \ldots, x_{4L+3}\} \subset M_L$, so that apart from possibly the first and last
pairs of points, the chains (2.6)-(2.7) above are contained in $M_L \times N$. But this contradicts the
inductive hypothesis, which asserts that $M_L \times N$ contains no chain of length $4L + 1$.

Similarly, if $M_{L+1} \times N$ contains a chain of length $4L + 4$, it must take the form of the first
$4L + 4$ points in (2.7) rather than (2.6); in the latter case $\{x_3, \ldots, x_{4L+3}\} \subset M_L$ whence $M_L \times N$
would contain a chain of length $4L + 3$, contradicting the inductive hypothesis. In the former
case, $\{x_4, \ldots, x_{4L+2}\} \subset M_L$, whence $M_L \times N$ contains a chain of length $4L$ which the inductive
hypothesis asserts is comprised of $2L$ distinct points $X := \{x_4, x_6, \ldots, x_{4L+2}\}$ and $2L + 1$
distinct points $Y := \{y_3, y_5, \ldots, y_{4L+3}\}$. Now $x_2$ and $x_{4L+4}$ both lie outside $X \subset M_L$,
since otherwise $M_L \times N$ contains a chain longer than $4L$. Moreover $x_2 \neq x_{4L+4}$, since otherwise
we can form a cycle (of length $4L + 2$), hence an infinite chain in $M_{L+1} \times N$, contradicting the
length bound already established. Similarly, $y_1 \neq y_{4L+5}$ are disjoint from $Y$, since otherwise
we can extract a cycle and build an infinite chain in $M_{L+1} \times N$. □

2.3 Costs on manifolds: 

Lemma 2.5 (Diffeomorphism from interior of simplex to punctured sphere). Fix the standard
simplex $\Delta = \{(t_0, \ldots, t_m) \mid t_i \ge 0 \text{ and } \sum_{i=0}^{m} t_i = 1\}$, and the unit ball $\Omega = B_1(e_1) \subset \mathbb{R}^{m+1}$ centered
at $e_1 = (1, 0, \ldots, 0) \in \mathbb{R}^{m+1}$. There is a smooth map $E : \Delta \to \partial\Omega$ which acts as a diffeomorphism
from $\Delta \setminus \partial\Delta$ to $\partial\Omega \setminus \{0\}$ such that $E$ and all of its derivatives vanish on the boundary $\partial\Delta$ of the
simplex: $E(\partial\Delta) = \{0\}$.

Proof. Let $f : [0,1] \to [1,2]$ be a smooth function satisfying the following properties:

(a) $f$ is nondecreasing,

(b) $f(s) = 1$ for every $s \in [0, 1/2]$,

(c) $f(1) = 2$ and all the derivatives of $f$ at $s = 1$ vanish.

Denote by $D_m$ the closed unit disc of dimension $m$ and by $S^m \subset \mathbb{R}^{m+1}$ the unit sphere. We also
denote by $\exp_N : T_N S^m \to S^m$ the exponential mapping from the north pole $N = (0, \ldots, 0, 1)$
associated with the restriction of the Euclidean metric in $\mathbb{R}^{m+1}$ to $S^m$. Then we set

$$F(v) = \exp_N \left( \frac{\pi}{2} f(|v|)\, v \right) \qquad \forall v \in D_m.$$
By construction, $F$ is smooth on $D_m$, $F$ is a diffeomorphism from $\mathrm{Int}(D_m)$ to $S^m \setminus \{S\}$, where
$S$ denotes the south pole of $S^m$, $F(\partial D_m) = \{S\}$ and all the derivatives of $F$ on $\partial D_m$ vanish.
Therefore, in order to prove the lemma, it is sufficient to construct a Lipschitz mapping $G : \Delta \to D_m$
which is smooth on $\mathrm{Int}(\Delta)$, is a diffeomorphism from $\mathrm{Int}(\Delta)$ to $\mathrm{Int}(D_m)$, and sends $\partial\Delta$ to
$\partial D_m$.
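The map $F$ defined above can be evaluated explicitly, since the exponential map of the round sphere at the north pole is $\exp_N(w) = \cos|w|\, N + \sin|w|\, w/|w|$. The sketch below is an added illustration: the particular profile `f`, built from the standard $e^{-1/t}$ smooth step, is one admissible choice satisfying (a)-(c) and is an assumption of this note, not the authors' choice.

```python
import math


def smooth_step(t):
    """Smooth nondecreasing step: 0 for t <= 0, 1 for t >= 1, flat to all orders at 0 and 1."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    a = math.exp(-1.0 / t)
    b = math.exp(-1.0 / (1.0 - t))
    return a / (a + b)


def f(s):
    """One profile with f = 1 on [0, 1/2], f(1) = 2, nondecreasing (properties (a)-(c))."""
    return 1.0 + smooth_step(2.0 * s - 1.0)


def F(v):
    """F(v) = exp_N((pi/2) f(|v|) v) on the unit sphere, for v in the closed unit disc."""
    r = math.sqrt(sum(t * t for t in v))
    if r == 0.0:
        return tuple([0.0] * len(v) + [1.0])        # the north pole N
    theta = 0.5 * math.pi * f(r) * r                # geodesic distance travelled from N
    return tuple(math.sin(theta) * t / r for t in v) + (math.cos(theta),)


print(F((0.0, 0.0)))   # centre of the disc -> north pole
print(F((1.0, 0.0)))   # boundary point -> south pole, up to rounding
print(F((0.3, 0.4)))   # interior point -> a point of the unit sphere
```

At $|v| = 1$ the geodesic has length $\pi$, which is why the whole boundary $\partial D_m$ collapses to the south pole.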

The simplex $\Delta$ is contained in the affine hyperplane

$$H = \left\{ t = (t_0, \ldots, t_m) \;\middle|\; \sum_{i=0}^{m} t_i = 1 \right\}.$$

Let $\bar t := (1/(m+1), \ldots, 1/(m+1))$ be the center of $\Delta$; we check easily that $\Delta$ is contained in the
disc centered at $\bar t$ with radius $\sqrt{1 - 1/(m+1)}$. For every $t \in \Delta \setminus \{\bar t\}$, we set $u_t := (t - \bar t)/|t - \bar t|$
and

$$\rho(t) := \min \{ s \ge 0 \mid t + s u_t \in \partial\Delta \} \qquad \forall t \in \Delta \setminus \{\bar t\}.$$

By construction, the function $\rho : \Delta \setminus \{\bar t\} \to [0, +\infty)$ is locally Lipschitz and satisfies for every unit
vector $u \in S^m \cap H_0$ (with $H = \bar t + H_0$),

$$\rho(\bar t + \alpha u) = \alpha_u - \alpha \qquad \forall \alpha \in (0, \alpha_u],$$

where $\alpha_u > 0$ is the unique $\alpha > 0$ such that $\bar t + \alpha u \in \partial\Delta$. We note that since $\Delta \subset B(\bar t, \sqrt{1 - 1/(m+1)})$,
we have indeed $\alpha_u \in (0, \sqrt{1 - 1/(m+1)}]$ for every $u \in S^m \cap H_0$. We also observe that the
$m$-dimensional ball $H \cap B(\bar t, 1/(2(m+1)))$ is contained in the interior of $\Delta$ and that $\rho \ge 1/(2(m+1))$
on that set. Pick a smooth function $g : [0, +\infty) \to [0,1]$ satisfying the following properties:

(d) $g$ is nonincreasing,

(e) $g(s) = 1 - 3(m+1)s$ for every $s \in [0, 1/(4(m+1))]$,

(f) $g(s) = 0$ for every $s \ge 1/(2(m+1))$.

Let $D$ be the $m$-dimensional unit disc in $H$ centered at $\bar t$, and define the function $G^0 : \Delta \to D$ by

$$G^0(t) = \bar t + [1 - g(\rho(t))]\,(t - \bar t) + g(\rho(t))\, u_t.$$

By construction, $G^0$ is Lipschitz and smooth on each ray starting from $\bar t$. Namely, for each unit
vector $u \in S^m \cap H_0$, we have

$$G^0_u(\alpha) := G^0(\bar t + \alpha u) = \bar t + [1 - g(\alpha_u - \alpha)]\,(\alpha u) + g(\alpha_u - \alpha)\, u \qquad \forall \alpha \in (0, \alpha_u].$$

The derivative of $G^0_u$ on each ray $\bar t + \mathbb{R}_+ u$ is given by

$$\frac{dG^0_u}{d\alpha}(\alpha) = \left[ 1 - g(\alpha_u - \alpha) + (\alpha - 1)\, g'(\alpha_u - \alpha) \right] u \qquad \forall \alpha \in (0, \alpha_u],$$
and there holds

$$1 - g(\alpha_u - \alpha) + (\alpha - 1)\, g'(\alpha_u - \alpha) \ \ge\ 1 - g(\alpha_u - \alpha) + \left( \sqrt{1 - \tfrac{1}{m+1}} - 1 \right) g'(\alpha_u - \alpha) \ >\ 0$$

by (d)-(f). Moreover, for every $u \in S^m \cap H_0$, the ray $\bar t + \mathbb{R}_+ u$ is invariant by $G^0$, $G^0_u(\alpha_u) = \bar t + u$,
and in addition $G^0(t) = t$ whenever $t \in H \cap B(\bar t, 1/(2(m+1)))$. In conclusion, $G^0$ is Lipschitz
and bijective from $\Delta$ to $D$. If we work in polar coordinates $z = (\alpha, u)$ with $\alpha > 0$ and $u \in S^m \cap H_0$,
then $G^0$ reads

$$G^0(z) = G^0(\alpha, u) = (G^0_u(\alpha), u),$$

for every $z \in \Delta$, the domain of $G^0$ in polar coordinates (since $G^0$ coincides with the identity near
$\bar t$ we do not care about the singularity at $\alpha = 0$). Thus for every $z$ in the interior of $\Delta$ where $G^0$ is
differentiable, the Jacobian matrix of $G^0$ at $z$, $J_z G^0$, is triangular and invertible. Recall that for every
$z$ in the interior of $\Delta$, the generalized Jacobian of $G^0$ at $z$ is defined as

$$\mathcal{J}_z G^0 := \mathrm{conv} \left\{ \lim_k J_{z_k} G^0 \;\middle|\; z_k \to_k z,\ G^0 \text{ diff. at } z_k \right\}.$$

By the above discussion and Rademacher's Theorem, for every $z$ in the interior of $\Delta$, $\mathcal{J}_z G^0$ is always
a nonempty compact subset of $M_m(\mathbb{R})$ which contains only invertible matrices. In conclusion, for
every $t \in \mathrm{Int}(\Delta)$ the generalized Jacobian of $G^0$ at $t$ satisfies the same properties: it is a nonempty
compact subset of $M_m(\mathbb{R})$ which contains only invertible matrices. Thanks to the Clarke Lipschitz
Inverse Function Theorem [9], we infer that the Lipschitz mapping $G^0 : \Delta \to D$ is locally
bi-Lipschitz from $\mathrm{Int}(\Delta)$ to $\mathrm{Int}(D)$. It remains to smooth $G^0$ in the interior of $\Delta$ by fixing $G^0$ on the
boundary $\partial\Delta$.

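To this aim, consider a mollifier $\theta : \mathbb{R}^m \to \mathbb{R}$, that is a smooth function satisfying the following
three conditions:

(g) $\theta \ge 0$,

(h) $\mathrm{Supp}(\theta) \subset D_m$,

(i) $\int_{\mathbb{R}^m} \theta(x)\, dx = 1$.

The multivalued mapping $t \in \mathrm{Int}(\Delta) \mapsto \mathcal{J}_t G^0 \subset M_m(\mathbb{R})$ is upper semicontinuous (its graph is
closed in $\mathrm{Int}(\Delta) \times M_m(\mathbb{R})$) and is valued in the set of compact convex sets of invertible matrices.
Hence, there is a continuous function $\epsilon : \mathrm{Int}(\Delta) \to (0, \infty)$ such that for every $t \in \mathrm{Int}(\Delta)$ and every
matrix $A \in M_m(\mathbb{R})$, the following holds

$$d \left( A, \mathrm{conv} \left( \{ \mathcal{J}_\beta G^0 \mid \beta \in B(t, \epsilon(t)) \} \right) \right) < \epsilon(t) \implies A \text{ is invertible.} \qquad (2.8)$$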

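Consider also a smooth function $\nu : H \to \mathbb{R}_+$ such that:

(j) $\nu(t) = 0$, for every $t \notin \mathrm{Int}(\Delta)$,

(k) $0 < \nu(t) < \min\{ d(t, \partial\Delta), \epsilon(t) \}$, for every $t \in \mathrm{Int}(\Delta)$,

(l) for every $t \in \mathrm{Int}(\Delta)$, $|\nabla_t \nu| < \epsilon(t)/K$, where $K > 0$ is a Lipschitz constant for $G^0$.

Then, we define the function $G : \Delta \to D$ by (we identify $H_0$ with $\mathbb{R}^m$)

$$G(t) = \int_{\mathbb{R}^m} \theta(x)\, G^0(t - \nu(t)x)\, dx \qquad \forall t \in \Delta.$$

By construction, $G$ is Lipschitz on $\Delta$, it coincides with $G^0$ on $\partial\Delta$, it satisfies $G(\mathrm{Int}(\Delta)) \subset \mathrm{Int}(D)$,
and it is smooth on $\mathrm{Int}(\Delta)$. For every $t \in \mathrm{Int}(\Delta)$, its Jacobian matrix at $t$ is given by

$$J_t G = \int_{\mathbb{R}^m} \theta(x)\, J_{t - \nu(t)x} G^0 \cdot \left( I_m - x \otimes \nabla_t \nu \right) dx.$$

Hence, we have for every $t \in \mathrm{Int}(\Delta)$,

$$\left\| \int_{\mathbb{R}^m} \theta(x)\, J_{t - \nu(t)x} G^0 \cdot \left( x \otimes \nabla_t \nu \right) dx \right\| \le K\, |\nabla_t \nu|$$

and

$$\int_{\mathbb{R}^m} \theta(x)\, J_{t - \nu(t)x} G^0\, dx \in \mathrm{conv} \left\{ \mathcal{J}_\beta G^0 \mid \beta \in B(t, \epsilon(t)) \right\}.$$

Using (2.8) and (j)-(l), we infer that $G$ is a local diffeomorphism at every point of $\mathrm{Int}(\Delta)$. Moreover,
$G$ is surjective. If not, there is $y \in D$ such that $y$ does not belong to the image of $G$. Since $G = G^0$
on $\partial\Delta$, $y$ does not belong to $\partial D$. Thus there is $y' \in \partial G(\Delta) \setminus \partial D$. Since $G$ is a local diffeomorphism
at any preimage of $y'$, we get a contradiction. In conclusion, $G$ is a Lipschitz mapping from $\Delta$ to
$D$ which sends bijectively $\partial\Delta$ to $\partial D$, which sends $\mathrm{Int}(\Delta)$ to $\mathrm{Int}(D)$, which is surjective, and which is
a smooth local diffeomorphism at every point of $\mathrm{Int}(\Delta)$. Moreover, $D$ is simply connected. Hence
$G : \Delta \to D$ is a Lipschitz mapping which is a smooth diffeomorphism from $\mathrm{Int}(\Delta)$ to $\mathrm{Int}(D)$. We
conclude easily. □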


Proposition 2.6 (Smooth costs on arbitrary manifolds leading to unique optimal transport). Fix
smooth closed manifolds $M, N$. Then there exists a cost $c \in C^\infty(M \times N)$ such that: for any pair
of Borel probability measures $\mu$ on $M$ and $\nu$ on $N$ which charge no $c - c$ hypersurfaces in their
respective domains, the minimizer of (2.1) is unique.

Proof. Let $m$ and $n$ denote the dimensions of $M$ and $N$, and assume $m \ge n$ without loss of
generality. Due to their smoothness, it is a classical result that both manifolds admit smooth
triangulations [27] into finitely many (say $k_M$ and $k_N$) simplices (by compactness).

For each $k \in \{1, 2, \ldots, k_M\}$, dilating the map $E$ of Lemma 2.5 by a factor of $k$ induces a smooth
map from the $k$-th simplex of $M$ to the sphere $k\,\partial B_1(e_1)$ of radius $k$ centered at $(k, 0, \ldots, 0) \in \mathbb{R}^{m+1}$.
Taken together, these $k_M$ maps define a single smooth map $E_M : M \to \tilde M$ where
$\tilde M = \bigcup_{k=1}^{k_M} \partial\Omega_k$ and $\Omega_k = k B_1(e_1) \subset \mathbb{R}^{m+1}$. This map acts as a diffeomorphism from the union of
simplex interiors in $M$ to $\tilde M \setminus \{0_{m+1}\}$ while collapsing their boundaries onto the origin $0_{m+1}$ in
$\mathbb{R}^{m+1}$. Set $M_0 := E_M^{-1}(0_{m+1})$.

Figure 3: A bouquet of nested convex sets

Define the analogous map $E_N : N \to \tilde N \subset \mathbb{R}^{n+1}$ where $\tilde N = \bigcup_{k=1}^{k_N} k\,\partial B_1(e_1) \subset \mathbb{R}^{n+1}$
and $N_0 = E_N^{-1}(0_{n+1})$. In case $n < m$, we embed $\mathbb{R}^{n+1}$ into $\mathbb{R}^{m+1}$ by identifying $\mathbb{R}^{n+1}$ with

$$\{ (x_1, \ldots, x_{m+1}) \mid x_{n+2} = \cdots = x_{m+1} = 0 \}.$$

The cost

$$c(x,y) := |E_M(x) - E_N(y)|^2 / 2$$

on $M \times N$ then satisfies the conclusions of the proposition. Its smoothness follows from that of $E_M$
and $E_N$. Lemma 2.4 shows that no chains of length greater than $4 k_M$ lie in $(M \setminus M_0) \times (N \setminus N_0)$.
On the other hand, the simplex boundaries $M_0$ lie in a finite union of smooth hypersurfaces,
hence are $\mu$-negligible. Similarly, $N_0$ is $\nu$-negligible. The desired conclusion now follows from
Theorem 1.4. □

3 Preliminaries on numbered limb systems 

3.1 Classical numbered limb systems 

The concept of numbered limb system was introduced by Hestir and Williams in [16]. Like Beneš
and Štěpán [2], their aim was to find necessary and sufficient conditions on the support of a joint
measure to guarantee its extremality in the space of measures which share its marginals.
Definition 3.1 (Numbered limb system). Let $X$ and $Y$ be subsets of complete separable metric
spaces. A relation $S \subset X \times Y$ is a numbered limb system if there are countable disjoint
decompositions of $X$ and $Y$,

$$X = \bigcup_{i=0}^{\infty} I_{2i+1} \quad \text{and} \quad Y = \bigcup_{i=0}^{\infty} I_{2i},$$

with a sequence of mappings

$$f_{2i} : \mathrm{Dom}(f_{2i}) \subset Y \to X \quad \text{and} \quad f_{2i+1} : \mathrm{Dom}(f_{2i+1}) \subset X \to Y$$

such that

$$S = \bigcup_{i=1}^{\infty} \left( \mathrm{Graph}(f_{2i-1}) \cup \mathrm{Antigraph}(f_{2i}) \right) \qquad (3.1)$$

and

$$\mathrm{Dom}(f_k) \cup \mathrm{Ran}(f_{k+1}) \subset I_k \qquad \forall k \ge 0. \qquad (3.2)$$



Figure 4: A numbered limb system with = 10 
The following statement from [1] extends and relaxes a result of Hestir and Williams [16]. Here
$\pi^X(x,y) = x$ and $\pi^Y(x,y) = y$.

Theorem 3.2 (Measures on measurable numbered limb systems are simplicial). Let $X$ and $Y$ be
Borel subsets of complete separable metric spaces, equipped with $\sigma$-finite Borel measures $\mu$ on $X$
and $\nu$ on $Y$. Suppose there is a numbered limb system

$$S = \bigcup_{i=1}^{\infty} \left( \mathrm{Graph}(f_{2i-1}) \cup \mathrm{Antigraph}(f_{2i}) \right) \qquad (3.3)$$

with the property that $\mathrm{Graph}(f_{2i-1})$ and $\mathrm{Antigraph}(f_{2i})$ are $\gamma$-measurable subsets of $X \times Y$ for each
$i \ge 1$ and for every $\gamma \in \Gamma(\mu, \nu)$ vanishing outside of $S$. If the system has finitely many limbs or
$\mu[X] < \infty$, then at most one $\gamma \in \Gamma(\mu, \nu)$ vanishes outside of $S$. If such a measure exists, it is given
by $\gamma = \sum_{k=1}^{\infty} \gamma_k$ where for every $i \ge 1$,

$$\gamma_{2i-1} = (\mathrm{id}_X \times f_{2i-1})_\# \eta_{2i-1}, \qquad \gamma_{2i} = (f_{2i} \times \mathrm{id}_Y)_\# \eta_{2i},$$

$$\eta_{2i-1} = \left( \mu - \pi^X_\# \gamma_{2i} \right) \big|_{\mathrm{Dom} f_{2i-1}}, \qquad \eta_{2i} = \left( \nu - \pi^Y_\# \gamma_{2i+1} \right) \big|_{\mathrm{Dom} f_{2i}}.$$

Here $\eta_k$ is a Borel measure on $I_k$ and $f_k$ is measurable with respect to the $\eta_k$ completion of the
Borel $\sigma$-algebra. If the system has $N < \infty$ limbs, $\gamma_k = 0$ for $k > N$, and $\eta_k$ and $\gamma_k$ can be computed
recursively from the formula above starting from $k = N$.
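The recursive flavour of the last statement has a simple finite analogue, sketched here as an added illustration rather than as the construction of [1] or [16]. Assume the support is a finite set of pairs whose bipartite graph has no cycles (a forest). Then a node with a single incident edge must send all of its remaining mass along that edge, and peeling such leaves one at a time determines the plan, if one exists. The function name `plan_on_forest` and its interface are hypothetical.

```python
def plan_on_forest(mu, nu, support, tol=1e-12):
    """Recover the only candidate plan supported on an acyclic set of pairs.

    mu, nu: dicts mapping atoms of X and Y to their masses (every atom that
    appears in `support` must be a key). support: set of (x, y) pairs.
    Returns a dict (x, y) -> mass, or None if no nonnegative plan with these
    marginals lives on `support`. Raises ValueError if the support has a cycle.
    """
    remaining = {("x", x): m for x, m in mu.items()}
    remaining.update({("y", y): m for y, m in nu.items()})
    edges = {(("x", x), ("y", y)) for x, y in support}
    plan = {}
    while edges:
        degree = {}
        for edge in edges:
            for node in edge:
                degree[node] = degree.get(node, 0) + 1
        leaf_edges = [e for e in edges if degree[e[0]] == 1 or degree[e[1]] == 1]
        if not leaf_edges:
            raise ValueError("support contains a cycle")
        edge = leaf_edges[0]
        leaf = edge[0] if degree[edge[0]] == 1 else edge[1]
        mass = remaining[leaf]            # the leaf's only edge carries all its mass
        if mass < -tol:
            return None
        plan[(edge[0][1], edge[1][1])] = mass
        remaining[edge[0]] -= mass
        remaining[edge[1]] -= mass
        edges.remove(edge)
    if any(abs(m) > tol for m in remaining.values()):
        return None                       # marginals cannot be matched on this support
    return plan


print(plan_on_forest({"a": 0.5, "b": 0.5}, {"p": 0.25, "q": 0.75},
                     {("a", "p"), ("a", "q"), ("b", "q")}))
```

When the support contains a cycle the peeling gets stuck, which mirrors the role played by cyclic chains as an obstruction to uniqueness.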

The statement of Theorem 3.2 from Ahmad, Kim and McCann, like its antecedent in [16], gives
a sufficient condition for extremality. It is separated from Beneš and Štěpán [2] and Hestir and
Williams' [16] necessary conditions for extremality by the $\gamma$-measurability assumed for the graphs
and antigraphs (which is satisfied, for example, whenever the graphs and antigraphs are Borel).
For sets $S$ of the form (3.3) whose graphs and antigraphs fail to be measurable, there may exist
non-extremal measures vanishing outside of $S$, as shown by Hestir and Williams using the axiom
of choice [16]. Such issues are further explored by Bianchini and Caravenna [5] and Moameni [23],
who arrive at their own criteria for extremality. Moameni's is closest in spirit to the approach
developed below based on chain length: he gets his measurability by assuming the existence of a
measurable Lyapunov function to distinguish different levels of the dynamics.


4 Proof of Theorem 1.4 and Remark 1.5 

Since the source and target spaces are closed manifolds and the cost $c \in C^1$, Gangbo and
McCann [12] provide a c-cyclically monotone compact set $S \subset M \times N$ and Lipschitz potentials
$\psi : M \to \mathbb{R}$ and $\phi : N \to \mathbb{R}$ which satisfy

$$\psi(x) = \max \{ \phi(y) - c(x,y) \mid y \in N \} \qquad \forall x \in M, \qquad (4.1)$$

$$\phi(y) = \min \{ \psi(x) + c(x,y) \mid x \in M \} \qquad \forall y \in N, \qquad (4.2)$$

$$S \subset \partial_c \psi := \{ (x,y) \in M \times N \mid c(x,y) = \phi(y) - \psi(x) \}, \qquad (4.3)$$

such that any plan $\gamma \in \Pi(\mu, \nu)$ is optimal if and only if $\mathrm{Supp}(\gamma) \subset S$. Indeed, we henceforth take $S$ to
be the smallest compact set with these properties.

We recall that the c-subdifferential of $\psi$ at $x \in M$ and the c-superdifferential of $\phi$ at $y \in N$ are
defined using (4.3):

$$\partial_c \psi(x) := \{ y \in N \mid (x,y) \in \partial_c \psi \}$$

$$\text{and} \quad \partial^c \phi(y) := \{ x \in M \mid (x,y) \in \partial_c \psi \}.$$

Note that since both $\psi$ and $\phi$ are Lipschitz and $\mu$ and $\nu$ are both absolutely continuous with respect
to Lebesgue, thanks to Rademacher's theorem, $\psi$ and $\phi$ are differentiable almost everywhere with
respect to $\mu$ and $\nu$ respectively. Let $\mathrm{Dom}\, d\psi$ denote the subset of $M$ on which $\psi$ is differentiable.
Following Clarke [8], for every $x \in M$ (resp. $y \in N$), we denote by $D^*\psi(x)$ and $\partial\psi(x)$ (resp.
$D^*\phi(y)$ and $\partial\phi(y)$) the limiting and generalized differentials of $\psi$ at $x$ (resp. $\phi$ at $y$) which are
defined by (we proceed in the same way with $\phi$)

$$D^*\psi(x) = \left\{ \lim_{k \to \infty} p_k \;\middle|\; p_k = d\psi(x_k),\ x_k \to x,\ x_k \in \mathrm{Dom}\, d\psi \right\} \subset T^*_x M,$$

and

$$\partial\psi(x) = \mathrm{conv}\left( D^*\psi(x) \right) \subset T^*_x M.$$
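As a minimal illustration of the two definitions just given (an added example in a one-dimensional chart, not taken from the source), consider the model function $\psi(x) = |x|$ near $x = 0$:

```latex
% Illustration only: limiting and generalized differentials of |x| at the origin.
\begin{align*}
d\psi(x) &= \operatorname{sign}(x) \quad (x \neq 0), \\
D^*\psi(0) &= \{-1,\, +1\}, \\
\partial\psi(0) &= \operatorname{conv}\{-1,\, +1\} = [-1,\, 1].
\end{align*}
```

Here $\partial\psi(0)$ is not a singleton, in line with the failure of differentiability at the origin, while at every $x \neq 0$ both sets reduce to $\{\operatorname{sign}(x)\}$.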

By Lipschitzness, for every $x \in M$, the sets $D^*\psi(x)$ and $\partial\psi(x)$ are nonempty and compact, and of
course $\partial\psi(x)$ is convex. The next three propositions are relatively standard; the lemmas which
follow them are new.

Proposition 4.1. For $c \in C^1$ the potentials $\psi$ and $\phi$ of (4.1)-(4.2) satisfy:

(i) The mappings $x \in M \mapsto \partial\psi(x)$ and $y \in N \mapsto \partial\phi(y)$ have closed graph.

(ii) For every $x \in M$, $\psi$ is differentiable at $x$ if and only if $\partial\psi(x)$ is a singleton.

(iii) For every $y \in N$, $\phi$ is differentiable at $y$ if and only if $\partial\phi(y)$ is a singleton.

(iv) The singular sets $M_0 := M \setminus \mathrm{Dom}\,d\psi$ and $N_0 := N \setminus \mathrm{Dom}\,d\phi$ are $\sigma$-compact.

Proof of Proposition 4.1. Assertion (i) is well-known [8], and follows easily from the definitions of $\partial\psi$ and $\partial\phi$. Let $x \in M$ be such that $\psi$ is differentiable at $x$. From (4.1)-(4.3) we have

$$-\frac{\partial c}{\partial x}(x,y) = d\psi(x) \qquad \forall y \in \partial_c\psi(x). \tag{4.4}$$

Argue by contradiction and assume that $\partial\psi(x)$ is not a singleton. This means that $D^*\psi(x)$ is not a singleton either; let $p \neq q$ be two one-forms in $D^*\psi(x)$. Then there are two sequences $\{x_k\}_k$, $\{x'_k\}_k$ converging to $x$ such that $\psi$ is differentiable at $x_k$ and $x'_k$ and

$$\lim_{k\to\infty} d\psi(x_k) = p, \qquad \lim_{k\to\infty} d\psi(x'_k) = q.$$

For each $k$, there are $y_k \in \partial_c\psi(x_k)$, $y'_k \in \partial_c\psi(x'_k)$ such that

$$-\frac{\partial c}{\partial x}(x_k,y_k) = d\psi(x_k) \quad\text{and}\quad -\frac{\partial c}{\partial x}(x'_k,y'_k) = d\psi(x'_k).$$

By compactness of $N$, we may assume that the sequences $\{y_k\}_k$, $\{y'_k\}_k$ converge respectively to some $y \in \partial_c\psi(x)$ and $y' \in \partial_c\psi(x)$. Passing to the limit, we get

$$-\frac{\partial c}{\partial x}(x,y) = p \neq q = -\frac{\partial c}{\partial x}(x,y'),$$

which contradicts (4.4). On the other hand, if $\partial\psi(x)$ is a singleton, then $\psi$ is differentiable at $x$ (indeed, $d\psi : \mathrm{Dom}\,d\psi \to T^*M$ is continuous at $x$, so $x$ is a Lebesgue point for $d\psi \in L^\infty$).

(iv) The set of $x$ such that $\partial\psi(x)$ fails to be a singleton is $\sigma$-compact because the multi-valued mapping $x \mapsto \partial\psi(x)$ has a closed graph, so that the mapping $x \mapsto \mathrm{diam}(\partial\psi(x))$ is upper semicontinuous. For every whole number $q$, this implies those $x$ with $\mathrm{diam}(\partial\psi(x)) \ge 1/q$ form a closed subset of the compact manifold $M$. The singular set $M \setminus \mathrm{Dom}\,d\psi$ is the union of such subsets over $q = 1, 2, \ldots$. The $\sigma$-compactness of $N \setminus \mathrm{Dom}\,d\phi = \bigcup_{q=1}^\infty \{y \in N \mid \mathrm{diam}(\partial\phi(y)) \ge 1/q\}$ follows by symmetry. □

Proposition 4.2 (Differentiability a.e.). The sets $M_0 := M \setminus \mathrm{Dom}\,d\psi$, $N_0 := N \setminus \mathrm{Dom}\,d\phi$ and $M_0 \times N_0$ are $\sigma$-compact, and $\mu(M_0) = \nu(N_0) = \gamma(M_0 \times N_0) = 0$ for every plan $\gamma \in \Pi(\mu,\nu)$.

Proof of Proposition 4.2. Since $S$ is compact, the $\sigma$-compactness of $E_0$ follows from that shown in Proposition 4.1(iv) for $M_0$ and $N_0$ (a product of unions being the union of the products). If $\mu$ and $\nu$ are absolutely continuous with respect to Lebesgue, Rademacher's theorem asserts $\mu[M_0] = 0$ and $\nu[N_0] = 0$. Otherwise $c \in C^{1,1}$, in which case Gangbo and McCann show the potentials $\phi$ and $-\psi$ are semiconvex [12], meaning their distributional Hessians admit local bounds from below in $L^\infty$. In this case the conclusion of Rademacher's theorem can be sharpened: Zajíček [28] shows $M_0$ and $N_0$ to be contained in countably many $c - c$ hypersurfaces, on which $\mu$ and $\nu$ are assumed to vanish. Finally $\gamma(M_0 \times N_0) \le \gamma(M_0 \times N) = \mu(M_0) = 0$. □


Since our manifolds $M$ and $N$ are compact, any open subset is $\sigma$-compact; in particular the complement of $S$ is $\sigma$-compact. In view of this fact and the proposition preceding it, by enlarging $E_0$ if necessary we may henceforth assume (i) $(M \times N \setminus S) \subset E_0$ and (ii) $M_0 \times N_0 \subset E_0$. Then $\tilde S := M \times N \setminus E_0$ ensures that for all pairs $(x,y) \in \tilde S$ we have differentiability of $\psi$ at $x$ and of $\phi$ at $y$.

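Proposition 4.3 (Marginal cost is marginal price). For every $(x,y) \in \tilde S$, (4.1)-(4.3) imply

$$d\psi(x) = -\frac{\partial c}{\partial x}(x,y) \quad\text{and}\quad d\phi(y) = \frac{\partial c}{\partial y}(x,y). \tag{4.5}$$

Proof of Proposition 4.3. Let $(x,y) \in \tilde S$; then we have by (4.1)-(4.2),

$$\phi(y') - c(x,y') \le \psi(x) \quad \forall y' \in N \quad\text{and}\quad \phi(y) - c(x,y) = \psi(x),$$
$$\psi(x') + c(x',y) \ge \phi(y) \quad \forall x' \in M \quad\text{and}\quad \psi(x) + c(x,y) = \phi(y).$$

We conclude easily since both $\psi$ and $\phi$ are differentiable respectively at $x$ and $y$. □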
We call L-chain in $\tilde S$ any ordered family of pairs

$$\big((x_1,y_1),\ldots,(x_L,y_L)\big) \subset \tilde S^L$$

such that for every $\ell = 1,\ldots,L-1$ there holds, either

$$\begin{cases} x_\ell = x_{\ell+1}, \\ y_\ell \neq y_{\ell+1} = y_{\min\{L,\ell+2\}}, \end{cases} \qquad\text{or}\qquad \begin{cases} y_\ell = y_{\ell+1}, \\ x_\ell \neq x_{\ell+1} = x_{\min\{L,\ell+2\}}. \end{cases}$$

Note that by construction, the set of pairs of any L-chain in $\tilde S$ is c-cyclically monotone as a subset of $S$, so by (4.5), any L-chain in $\tilde S$ is indeed an L-chain with respect to $c$ (Definition 1.3). We define the level $\ell(x,y)$ of each $(x,y) \in \tilde S$ to be the supremum of all natural numbers $L \in \mathbb{N}^*$ such that there is at least one chain $((x_1,y_1),\ldots,(x_L,y_L))$ in $\tilde S$ of length $L$ such that $(x,y) = (x_L,y_L)$. Moreover, given a chain $((x_1,y_1),\ldots,(x_L,y_L))$ with $L \ge 2$ in $\tilde S$, we say that $(x_L,y_L)$ is a horizontal end if $y_L = y_{L-1}$ and a vertical end if $x_L = x_{L-1}$. We set

$$S_L := \{(x,y) \in \tilde S \mid \ell(x,y) \ge L\} \qquad \forall L \in \mathbb{N}^*,$$

and denote by $S_L^h$ (resp. $S_L^v$) the set of pairs $(x,y) \in S_L$ such that there exists an L-chain $((x_1,y_1),\ldots,(x_L,y_L))$ in $\tilde S$ such that $(x,y) = (x_L,y_L)$ and $y_{L-1} = y_L$ (resp. $x_{L-1} = x_L$). Although projections of Borel sets are not necessarily Borel (see [25]), the following lemma holds.

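Lemma 4.4 (Borel measurability). The sets $S_1 = \tilde S$ and $S_2^h, S_2^v, \ldots, S_L^h, S_L^v$ are Borel: each takes the form $\bigcup_{p=1}^\infty \bigcap_{q=1}^\infty U_{p,q}$, where the sets $U_{1,1}$ and $U_{p-1,q} \subset U_{p,q} \subset U_{p,q-1}$ are open for each $p,q \ge 2$.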

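Proof of Lemma 4.4. Given $L \ge 3$ odd, we shall show that $S_L^v$ has the asserted structure. The other cases are left to the reader. Endow the manifolds $M$ and $N$ with Riemannian distances $d_M$ and $d_N$, and let $d_L$ denote the product distance on the product manifold $(M \times N)^L$. For every integer $p \ge 1$, denote by $S_p$ the set of L-tuples

$$\big((x_1,y_1),\ldots,(x_L,y_L)\big) \subset S^L$$

satisfying for every $\ell = 1,\ldots,L-1$,

$$\begin{cases} \text{for } \ell \text{ even:} & x_{\ell+1} = x_\ell \ \text{ and } \ d_N(y_{\ell+1},y_\ell) \ge 1/p, \\ \text{for } \ell \text{ odd:} & y_{\ell+1} = y_\ell \ \text{ and } \ d_M(x_{\ell+1},x_\ell) \ge 1/p. \end{cases}$$

Since $S$ is compact, the set $S_p$ is compact too.

On the other hand, $\sigma$-compactness of $E_0$ yields $(M \times N) \setminus E_0 = \bigcap_{q=1}^\infty V_q$ for a monotone sequence of open sets $V_q \subset V_{q-1}$. For every integer $q \ge 1$, we denote by $\tilde S^q$ the open set of L-tuples

$$\big((x_1,y_1),\ldots,(x_L,y_L)\big) \subset (V_q)^L.$$

A pair $(x,y) \in M \times N$ belongs to $S_L^v$ if and only if there is $p \ge 1$ such that

$$(x,y) \in \mathrm{Proj}_L\Big(S_p \cap \bigcap_{q} \tilde S^q\Big),$$

where the projection $\mathrm{Proj}_L : (M \times N)^L \to M \times N$ is defined by

$$\mathrm{Proj}_L\big((x_1,y_1),\ldots,(x_L,y_L)\big) := (x_L,y_L).$$

For integers $p,q \ge 1$, let $S_p^q$ be the set of points which are at distance $d_L < 1/q$ from $S_p$ in $(M \times N)^L$. Since $S_p$ is compact, for every $p$ we have

$$\bigcap_q \big(S_p \cap \tilde S^q\big) = \bigcap_q \big(S_p^q \cap \tilde S^q\big).$$

Moreover since for every $p$, the sequence of sets $\{S_p^q \cap \tilde S^q\}_q$ is non-increasing with respect to inclusion, we have for every $p$,

$$\mathrm{Proj}_L\Big(\bigcap_q \big(S_p^q \cap \tilde S^q\big)\Big) = \bigcap_q \mathrm{Proj}_L\big(S_p^q \cap \tilde S^q\big).$$

The open sets $U_{p,q} = \mathrm{Proj}_L(S_p^q \cap \tilde S^q)$ then have the asserted monotonicities $U_{p-1,q} \subset U_{p,q} \subset U_{p,q-1}$ with respect to $p$ and $q$, and we find $S_L^v = \bigcup_{p=1}^\infty \bigcap_{q=1}^\infty U_{p,q}$ is the desired countable union of $G_\delta$ sets. □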

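Corollary 4.5 (Borel measurability of projections). For $i \ge 1$, the projections $\pi^M(S_i)$ and $\pi^N(S_i)$ of $S_i$ (and of $S_i^h$, $S_i^v$ if $i > 1$) take the form $\bigcup_p \bigcap_q V_{p,q}$, where $V_{1,1}$ and $V_{p-1,q} \subset V_{p,q} \subset V_{p,q-1}$ are open for each $p,q \ge 2$.

Proof. If $S_i^{h/v} = \bigcup_p \bigcap_q U_{p,q}^{h/v}$ for $i > 1$ with $U_{p-1,q}^{h/v} \subset U_{p,q}^{h/v} \subset U_{p,q-1}^{h/v}$, then setting $V_{p,q} = \pi^M(U_{p,q})$ with $U_{p,q} = U_{p,q}^h \cup U_{p,q}^v$ shows $\pi^M(S_i) = \bigcup_p \bigcap_q V_{p,q}$ as desired. The other cases are similar. □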

We recall that a set $S \subset M \times N$ is called a graph if for every $(x,y) \in S$ there is no $y' \neq y$ such that $(x,y') \in S$. A set $S \subset M \times N$ is called an antigraph if for every $(x,y) \in S$ there is no $x' \neq x$ such that $(x',y) \in S$. Any graph is the graph of a function defined on a subset of $M$ and valued in $N$ while any antigraph is the graph of a function defined on a subset of $N$ and valued in $M$. We call Borel graph or Borel antigraph any graph or antigraph which is a Borel set in $M \times N$. We are now ready to construct our numbered limb system.

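Motivated by the inclusion $S_{k+1} \subset S_k$, we set $E_1 := S_1 \setminus S_2$,

$$E_k := S_k \setminus S_{k+1}, \qquad E_k^h := E_k \cap S_k^h, \quad E_k^v := E_k \cap S_k^v, \qquad \begin{cases} E_k^{h-} := E_k \setminus S_k^v, \\ E_k^{v-} := E_k \setminus S_k^h, \\ E_k^{hv} := E_k^h \cap E_k^v, \end{cases} \qquad \forall k \ge 2. \tag{4.6}$$

Notice that $E_k$ consists precisely of the points in $\tilde S$ at level $k$. All these sets are Borel according to Lemma 4.4. Letting $E_\infty := \bigcap_{k=1}^\infty S_k$ gives a decomposition

$$\tilde S = E_\infty \cup E_1 \cup \Big(\bigcup_{k=2}^\infty \big(E_k^{h-} \cup E_k^{v-} \cup E_k^{hv}\big)\Big) \tag{4.7}$$

of $\tilde S$ into disjoint Borel sets. The next lemma implies the $E_k^{h-}$ are graphs and the $E_k^{v-}$ are antigraphs; $E_1$ is simultaneously a graph and an antigraph, as are the $E_k^{hv}$.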
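On a finite set of pairs the level of each point, and whether it is reached by a longest chain through a horizontal or a vertical end, can be computed directly. The sketch below is a toy model only: it ignores the cost and the measurability questions entirely, treats any finite set of pairs as the set in which chains live, and the function name `chain_levels` and the labels `"h-"`, `"v-"`, `"hv"` are conventions chosen here. It relies on the fact that the steps of a chain alternate between vertical (same x) and horizontal (same y), and reports an infinite level when chains can be prolonged indefinitely.

```python
import math

def chain_levels(points):
    """Return {point: (level, kind)} for a finite set of pairs (x, y).

    kind is "h-" (only horizontal ends reach the level), "v-" (only
    vertical ends), "hv" (both), or None at level 1 or infinite level.
    """
    pts = list(dict.fromkeys(points))  # drop duplicates, keep order
    n = len(pts)
    # H[p] (resp. V[p]): length of the longest chain ending at p that is
    # either trivial or has a horizontal (resp. vertical) end.
    H = {p: 1 for p in pts}
    V = {p: 1 for p in pts}
    rounds = 2 * n + 2  # longer chains must repeat a (point, end type) state
    for _ in range(rounds):
        # a horizontal step (same y, new x) follows a trivial or vertical end
        new_H = {(x, y): max([1] + [1 + V[q] for q in pts
                                    if q[1] == y and q[0] != x])
                 for (x, y) in pts}
        # a vertical step (same x, new y) follows a trivial or horizontal end
        new_V = {(x, y): max([1] + [1 + H[q] for q in pts
                                    if q[0] == x and q[1] != y])
                 for (x, y) in pts}
        H, V = new_H, new_V
    result = {}
    for p in pts:
        level = max(H[p], V[p])
        if level >= rounds:
            result[p] = (math.inf, None)   # chains of every length end here
        elif level == 1:
            result[p] = (1, None)
        elif H[p] == V[p]:
            result[p] = (level, "hv")
        else:
            result[p] = (level, "h-" if H[p] > V[p] else "v-")
    return result
```

For the three points (0,0), (0,1), (1,1) the middle point has level 2 and is both a horizontal and a vertical end, while the two outer points have level 3. For the four corners of a square every point has infinite level, which is the finite caricature of the set $E_\infty$.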
Lemma 4.6 (Graph and antigraph properties). (a) Let $(x,y_i) \in E_i$ and $(x,y_j) \in E_j$ with $j \ge i \ge 1$ and $y_i \neq y_j$. Then $i \ge 2$. Moreover, if $j > i$ then $(x,y_i) \in E_i^h$ and $(x,y_j) \in E_j^{v-}$, so $j = i+1$; otherwise $j = i$ and both $(x,y_i), (x,y_j) \in E_i^{v-}$.

(b) Similarly, suppose $(x_i,y) \in E_i$ and $(x_j,y) \in E_j$ with $j \ge i \ge 1$ and $x_i \neq x_j$. Then $i \ge 2$ and if $j > i$ then $(x_i,y) \in E_i^v$ and $(x_j,y) \in E_j^{h-}$, so $j = i+1$; otherwise $j = i$ and both $(x_i,y), (x_j,y) \in E_i^{h-}$.

Proof. (a) Let $(x,y_i) \in E_i$ and $(x,y_j) \in E_j$ with $j \ge i \ge 1$ and $y_i \neq y_j$. Then $(x,y_i), (x,y_j)$ form a 2-chain and both points lie in $S_2$, forcing $i \ge 2$. If $(x,y_j) \in E_j^h$, there is a j-chain in $\tilde S$ terminating in the horizontal end $(x,y_j)$. Appending $(x,y_i)$ to this chain produces a chain of length $j+1$ with vertical end $(x,y_i)$, whence $i = \ell(x,y_i) \ge j+1$. This contradicts our hypothesis $i \le j$. We therefore conclude $(x,y_j) \in E_j^{v-}$. Note that if $(x,y_i) \in E_i^h$, the same argument shows

$$j = \ell(x,y_j) \ge i+1. \tag{4.8}$$

Whether or not this is true, $\tilde S$ contains a j-chain

$$\big((x'_1,y'_1),\ldots,(x'_j,y'_j)\big) \tag{4.9}$$

terminating in the vertical end $(x,y_j)$, so

$$x = x'_j = x'_{j-1} \quad\text{and}\quad y_j = y'_j \neq y'_{j-1}.$$

Now either (c) $(x,y_i) \in E_i^h$ or (d) $(x,y_i) \in E_i^{v-}$. In case (c), we claim $y'_{j-1} = y_i$. Otherwise the sequence

$$\big((x'_1,y'_1),\ldots,(x'_{j-1},y'_{j-1}),(x,y_i)\big)$$

would be a chain in $\tilde S$ of length $j \le \ell(x,y_i) = i$, contradicting (4.8). Thus $y'_{j-1} = y_i$ and $i = \ell(x,y_i) \ge j-1$, which implies equality holds in (4.8).

In case (d), $(x,y_i) \in E_i^{v-}$, we replace $(x'_j,y'_j)$ with $(x,y_i)$ in (4.9) to produce a chain of length $j \le \ell(x,y_i) = i$, forcing $i = j$ as desired.

Part (b) of the lemma now follows from part (a) by symmetry. □


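We define the graphs and antigraphs of our numbered limb system:

$$G_1 := E_1 \cup E_2^{h-}, \qquad G_{2i} := E_{2i}^{v-} \cup E_{2i}^{hv} \cup E_{2i+1}^{v-} = E_{2i}^{v} \cup E_{2i+1}^{v-}, \quad\text{and}\quad G_{2i+1} := E_{2i+2}^{h-} \cup E_{2i+1}^{hv} \cup E_{2i+1}^{h-} = E_{2i+2}^{h-} \cup E_{2i+1}^{h} \tag{4.10}$$

for all integers $i \in \mathbb{N}^*$, and adopt the convention $G_0 = \emptyset$.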

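Lemma 4.7 (Disjointness of domains and ranges). For $k \in \mathbb{N}$ set

$$I_k = \begin{cases} \pi^M(G_k \cup G_{k+1}) & \text{if } k \text{ odd and} \\ \pi^N(G_k \cup G_{k+1}) & \text{if } k \text{ even.} \end{cases} \tag{4.11}$$

Then the subsets $\{I_{2i+1}\}_{i=0}^\infty$ of $M$ are disjoint, as are the subsets $\{I_{2i}\}_{i=1}^\infty$ of $N$.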
Proof. For $i \in \mathbb{N}$, we shall show the sets $I_{2i+1} \subset M$ are disjoint. Disjointness of the subsets $I_{2i}$ of $N$ is proved similarly, using Lemma 4.6(b).

To derive a contradiction, suppose x € hi+i H l 2 j+i with z < j. Depending on whether z = 0 or 
z > 1, there exist {x,y) G EiU E 2 ~ U {x,y') € Since 2z + 2 < 2j + l 

disjointness of the Ek imply y 7 ^ y'. Lemma 4.6(a) then asserts {x,y') G ^ 2^+1 ^ 2^+2 — ^he 

desired contradiction. Thus the subsets {/ 2 i+i}“o ^ disjoint. □ 

Lemma 4.8 (Numbered limbs). The Borel sets $\{G_{2i+1}\}_{i=0}^\infty$ of (4.10) form the graphs and $\{G_{2i}\}_{i=1}^\infty$ form the antigraphs of a numbered limb system: $G_{2i+1} = \mathrm{Graph}(f_{2i+1})$ and $G_{2i} = \mathrm{Antigraph}(f_{2i})$, with $\mathrm{Dom}\,f_k \cup \mathrm{Ran}\,f_{k+1} \subset I_k$ from (4.11) for all $k \in \{0,1,\ldots,K\}$.

Proof. The sets $G_k$ are Borel by their construction (4.6), (4.10) and Lemma 4.4. If $i > 0$ we claim $G_{2i+1} := E_{2i+2}^{h-} \cup E_{2i+1}^{hv} \cup E_{2i+1}^{h-}$ is a graph: let $(x,y) \neq (x,y')$ be distinct points in $G_{2i+1}$. Lemma 4.6(a) asserts that at least one of the two points lies in $E_{2i+1}^{v-}$ or $E_{2i+2}^{v-}$, which is a contradiction. The fact that $G_{2i}$ is an antigraph follows by symmetry, and the fact that $G_1$ is a graph is checked similarly.

We can therefore write $G_{2i+1} = \mathrm{Graph}(f_{2i+1})$ and $G_{2i} = \mathrm{Antigraph}(f_{2i})$ for some sequence of maps $f_k : \mathrm{Dom}\,f_k \to \mathrm{Ran}\,f_k$ with domains $\mathrm{Dom}\,f_k \subset M$ and ranges $\mathrm{Ran}\,f_k \subset N$ if $k$ odd, and $\mathrm{Dom}\,f_k \subset N$ and $\mathrm{Ran}\,f_k \subset M$ if $k$ even. The fact that $\mathrm{Dom}\,f_k \cup \mathrm{Ran}\,f_{k+1} \subset I_k$ follows directly from (4.11), while Lemma 4.7 implies disjointness of the $I_{2i+1} \subset M$ and of the $I_{2i} \subset N$. If $\tilde M := M \setminus \bigcup_{i=0}^\infty I_{2i+1}$ or $\tilde N := N \setminus \bigcup_{i=0}^\infty I_{2i}$ is non-empty, we replace $I_0$ by $I_0 \cup \tilde N$ and $I_1$ by $I_1 \cup \tilde M$ to complete our verification of the properties of a numbered limb system (Definition 3.1). □

Proof of Theorem 1.4 and Remark 1.5. To recapitulate: Gangbo and McCann [12] provide a c-cyclically monotone compact set $S$ containing the support of every optimizer $\gamma \in \Pi_0$, and a pair of Lipschitz potentials (4.1)-(4.3) such that $S \subset \partial_c\psi$. We take $S$ to be the minimal such set without loss of generality. Proposition 4.2 shows $M_0 := M \setminus \mathrm{Dom}\,d\psi$ to be $\mu$-negligible and $N_0 := N \setminus \mathrm{Dom}\,d\phi$ to be $\nu$-negligible; both are $\sigma$-compact by Proposition 4.1. Without loss of generality, we therefore assume $M_0 \times N_0 \subset E_0$ and $M \times N \setminus S \subset E_0$, the $\gamma$-negligible $\sigma$-compact set. Lemma 4.8 provides a decomposition (4.7) of $\tilde S := M \times N \setminus E_0$ into a numbered limb system consisting of Borel graphs and antigraphs, apart from a Borel set $E_\infty = \bigcap_{k=1}^\infty S_k$. But we have $\gamma(E_\infty) = 0$ for each $\gamma \in \Pi_0$ by hypothesis. Theorem 3.2 therefore asserts that at most one $\gamma \in \Pi_0$ vanishes outside $\tilde S \setminus E_\infty$. But since all $\gamma \in \Pi_0$ have this property, $\Pi_0$ must be a singleton. Finally, since $S_{k+1} \subset S_k$ we see $\pi^M(E_\infty) = \bigcap_{k=1}^\infty \pi^M(S_k)$ and $\pi^N(E_\infty)$ are Borel using Corollary 4.5. □


5 Proof of Theorem 1.1 

Noting $\dim M = \dim N$, let $(\bar x,\bar y) \in M \times N$ be such that $\frac{\partial^2 c}{\partial y\,\partial x}(\bar x,\bar y)$ is invertible. The mapping

$$F : y \in N \longmapsto \frac{\partial c}{\partial x}(\bar x,y) \in T^*_{\bar x}M$$

is $C^1$ and since its differential at $\bar y$ is not singular, its image contains an open set in $T^*_{\bar x}M$. By Sard's theorem (see [10, §3.4.3]), the image of critical points of $F$ has Lebesgue measure zero, so we may assume without loss of generality that $F(\bar y)$ is a regular value of $F$, meaning there is no $y$ with $\frac{\partial^2 c}{\partial y\,\partial x}(\bar x,y)$ singular such that $F(y) = F(\bar y)$. The next lemma then follows from topological arguments.

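Lemma 5.1 (Generic failure of twist). Fix $(\bar x,\bar y) \in M \times N$ such that $F(\bar y)$ is a regular value of $F(y) = \frac{\partial c}{\partial x}(\bar x,y)$. There is $\hat y \in N$ with $\hat y \neq \bar y$ such that $F(\hat y) = F(\bar y)$, i.e.

$$\frac{\partial c}{\partial x}(\bar x,\hat y) = \frac{\partial c}{\partial x}(\bar x,\bar y). \tag{5.1}$$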

Proof of Lemma 5.1. We argue by contradiction and assume that

$$\forall y \in N, \qquad y \neq \bar y \implies F(y) \neq F(\bar y).$$

Note that since $F$ is a local diffeomorphism in a neighborhood of $\bar y$, the above condition still holds if we replace $F$ by $\tilde F$, a smooth (of class $C^\infty$) regularization of $F$ sufficiently close to $F$. So without loss of generality we may assume that $F$ is smooth. Define the mapping $G : N \setminus \{\bar y\} \to \mathbb{S}^{n-1}$ by

$$G(y) := \frac{F(y) - F(\bar y)}{|F(y) - F(\bar y)|} \qquad \forall y \in N \setminus \{\bar y\}.$$

The mapping $G$ is smooth, so by Sard's Theorem it has a regular value $\lambda$. Then the set

$$G^{-1}(\lambda) := \{y \in N \setminus \{\bar y\} \mid G(y) = \lambda\}$$

is a one dimensional submanifold of $N \setminus \{\bar y\}$. Moreover, since the differential of $F$ at $\bar y$ is invertible, there are an open neighborhood $\mathcal{G}$ of $\bar y$ and a $C^1$ curve $\gamma : [-\epsilon,\epsilon] \to N$ with $\gamma(0) = \bar y$ and $\dot\gamma(0) \neq 0$ such that

$$G(\gamma(\pm t)) = \pm\lambda \qquad \forall t \in (0,\epsilon]$$

and

$$G^{-1}(\lambda) \cap \mathcal{G} = \gamma((0,\epsilon)).$$

This shows that the closure of $G^{-1}(\lambda)$ is a compact one dimensional submanifold whose boundary is $\bar y$. But the boundary of any compact submanifold of dimension one is a finite set with even cardinality (see [22]), a contradiction. □


We now need to construct a c-convex function whose c-subdifferential at each point near $\bar x$ takes values near both $\bar y$ and $\hat y$. We note that since $F(\bar y)$ is a regular value of $F(y) = \frac{\partial c}{\partial x}(\bar x,y)$ and $F(\hat y) = F(\bar y)$, both linear mappings $D_yF(\bar y)$, $D_yF(\hat y)$ are invertible.

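Lemma 5.2. There is a pair of functions $\psi : M \to \mathbb{R}$, $\phi : N \to \mathbb{R}$ such that

$$\psi(x) = \max_{y \in N}\{\phi(y) - c(x,y)\} \qquad \forall x \in M \tag{5.2}$$

and

$$\phi(y) = \min\{\psi(x) + c(x,y) \mid x \in M\} \qquad \forall y \in N, \tag{5.3}$$

together with an open neighborhood $U$ of $\bar x$, two open neighborhoods $\bar V \subset N$ of $\bar y$ and $\hat V \subset N$ of $\hat y$ with $\bar V \cap \hat V = \emptyset$, and two diffeomorphisms

$$\bar y : U \to \bar V, \qquad \hat y : U \to \hat V$$

with

$$\bar y(\bar x) = \bar y \quad\text{and}\quad \hat y(\bar x) = \hat y,$$

such that

$$\partial_c\psi(x) \cap (\bar V \cup \hat V) = \{\bar y(x), \hat y(x)\} \qquad \forall x \in U. \tag{5.4}$$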

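Proof of Lemma 5.2. Since we work locally in neighborhoods of $\bar x$, $\bar y$ and $\hat y$, taking charts, we may assume that we work in $\mathbb{R}^n$. For every symmetric $n \times n$ matrix $Q$, there is a function $f : M \to \mathbb{R}$ of class $C^2$ such that

$$d_{\bar x}f = -\frac{\partial c}{\partial x}(\bar x,\bar y) = -\frac{\partial c}{\partial x}(\bar x,\hat y) \tag{5.5}$$

and

$$\mathrm{Hess}_{\bar x} f = Q.$$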


Let Q be fixed such that 


c cf^ e 


(5.6) 


(5.7) 


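we claim that there is a c-convex function $\psi : M \to \mathbb{R}$ which coincides with $f$ in a neighborhood of $\bar x$ and which satisfies the required properties. Since both $\frac{\partial^2 c}{\partial y\,\partial x}(\bar x,\bar y)$ and $\frac{\partial^2 c}{\partial y\,\partial x}(\bar x,\hat y)$ are invertible and (5.5) holds with $\bar y \neq \hat y$, the Implicit Function Theorem yields an open neighborhood $U \subset M$ of $\bar x$, two disjoint open neighborhoods $\bar V, \hat V \subset N$ of $\bar y, \hat y$ respectively, and two functions of class $C^1$

$$x \in U \longmapsto \bar y(x) \in \bar V, \qquad x \in U \longmapsto \hat y(x) \in \hat V$$

such that

$$\bar y(\bar x) = \bar y, \qquad \hat y(\bar x) = \hat y,$$

$$\frac{\partial c}{\partial x}(x,\bar y(x)) = -d_xf \quad\text{and}\quad \frac{\partial c}{\partial x}(x,\hat y(x)) = -d_xf \qquad \forall x \in U. \tag{5.8}$$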


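Taking one derivative at $\bar x$ in the latter yields

$$\frac{\partial^2 c}{\partial x^2}(\bar x,\bar y) + \frac{\partial^2 c}{\partial y\,\partial x}(\bar x,\bar y)\,\frac{\partial \bar y}{\partial x}(\bar x) = -\mathrm{Hess}_{\bar x}f, \qquad \frac{\partial^2 c}{\partial x^2}(\bar x,\hat y) + \frac{\partial^2 c}{\partial y\,\partial x}(\bar x,\hat y)\,\frac{\partial \hat y}{\partial x}(\bar x) = -\mathrm{Hess}_{\bar x}f,$$

which can be written as

$$\frac{\partial \bar y}{\partial x}(\bar x) = -\Big(\frac{\partial^2 c}{\partial y\,\partial x}(\bar x,\bar y)\Big)^{-1}\Big[\mathrm{Hess}_{\bar x}f + \frac{\partial^2 c}{\partial x^2}(\bar x,\bar y)\Big], \qquad \frac{\partial \hat y}{\partial x}(\bar x) = -\Big(\frac{\partial^2 c}{\partial y\,\partial x}(\bar x,\hat y)\Big)^{-1}\Big[\mathrm{Hess}_{\bar x}f + \frac{\partial^2 c}{\partial x^2}(\bar x,\hat y)\Big].$$

Therefore, by (5.6)-(5.7) we infer that $\frac{\partial \bar y}{\partial x}(\bar x)$ and $\frac{\partial \hat y}{\partial x}(\bar x)$ are invertible. Then restricting $U, \bar V, \hat V$ if necessary, we may assume that the mappings

$$x \in U \longmapsto \bar y(x) \in \bar V, \qquad x \in U \longmapsto \hat y(x) \in \hat V$$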

are diffeomorphisms. Moreover, the functions of class given by 

G : xGU I— f{x) - f{x) + c {x, y{x)) - c {x, y{x)) 

and 

G : xGU I—/(x) - f{x) + c (x, y(x)) - c (x, y{x)) 
satisfy (using (5.6)-(5.8)) 


<3 + |lb.s) 


G'(x) = G'(x) = 0, dxG = dxG = 0, Hess^G = HessgG = — 
so we may also assume that 

/ /(a;')-/(a;) + c(x',j/(x'))-c(x,y(x')) < 0 a U \ i t\ 'd'r a Tl 

\ /(x')-/(x) + c(x',y(x'))-c(x,y(x')) < 0 Vx e tv \ |x>, Vx e . 

As a matter of fact, freezing x in the first line of (5.9) and setting 

Gx{x') = f{x') - /(x) + c (x', y{x')) - c (x, y{x')) Vx' G U, 
we check that for every x G U, we have 

dc. 


< 0 , 


(5.9) 


G2,(x)=0, dxGx = dxf +-^{x,yix)) = 0 


and for every x' G U 


— c)c 

dx'Gx = dx'f + — (x',y(x')) + 






which implies 




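Define the functions $\phi_0 : N \to \mathbb{R} \cup \{-\infty\}$ and $\psi : M \to \mathbb{R}$ by

$$\phi_0(y) := \begin{cases} f(\bar y^{-1}(y)) + c(\bar y^{-1}(y),y) & \text{if } y \in \bar V, \\ f(\hat y^{-1}(y)) + c(\hat y^{-1}(y),y) & \text{if } y \in \hat V, \\ -\infty & \text{if } y \notin \bar V \cup \hat V, \end{cases} \qquad \forall y \in N$$

and

$$\psi(x) = \max_{y \in N}\{\phi_0(y) - c(x,y)\} \qquad \forall x \in M.$$

We observe that we have for every $x \in U$,

$$\phi_0(\bar y(x)) - c(x,\bar y(x)) = \phi_0(\hat y(x)) - c(x,\hat y(x)).$$

Then, since $\bar V = \bar y(U)$ and $\hat V = \hat y(U)$, we have

$$\psi(x) = \max_{x' \in U}\,\max\big\{\phi_0(\bar y(x')) - c(x,\bar y(x')),\ \phi_0(\hat y(x')) - c(x,\hat y(x'))\big\} \qquad \forall x \in M.$$

By the above construction and (5.9), we have for every $x \in U$ and any $x' \in U \setminus \{x\}$,

$$\phi_0(\bar y(x)) - c(x,\bar y(x)) = f(x) > f(x') + c(x',\bar y(x')) - c(x,\bar y(x')) = \phi_0(\bar y(x')) - c(x,\bar y(x')),$$

and similarly with $\hat y$ in place of $\bar y$. We infer that

$$\psi(x) = \phi_0(\bar y(x)) - c(x,\bar y(x)) = \phi_0(\hat y(x)) - c(x,\hat y(x)) = f(x) \qquad \forall x \in U.$$

Setting

$$\phi(y) = \min\{\psi(x) + c(x,y) \mid x \in M\} \qquad \forall y \in N,$$

we check that (5.2)-(5.4) are satisfied; in particular $\psi(x) = \max\{\phi(y) - c(x,y)\}$ for every $x \in M$. □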
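The second-order computation behind (5.9) can be checked by hand in one dimension. The following is a sketch under simplifying assumptions that are not taken from the text: $n = 1$, subscripts denote partial derivatives, $y(\cdot)$ stands for either of the two implicit functions of (5.8), and $G_x$ is the function of $x'$ obtained by freezing $x$ in (5.9).

```latex
% Sketch for n = 1 (an assumption made here for readability).
% Defining relation (5.8), differentiated once in x:
\[
  c_x\bigl(x, y(x)\bigr) = -f'(x)
  \quad\Longrightarrow\quad
  c_{xy}\bigl(x,y(x)\bigr)\, y'(x) = -\bigl( f''(x) + c_{xx}(x,y(x)) \bigr).
\]
% Frozen-x function from (5.9):
\[
  G_x(x') = f(x') - f(x) + c\bigl(x', y(x')\bigr) - c\bigl(x, y(x')\bigr).
\]
% First derivative in x'; the first two terms cancel by (5.8):
\[
  G_x'(x') = \underbrace{f'(x') + c_x\bigl(x',y(x')\bigr)}_{=\,0}
           + \bigl[ c_y\bigl(x',y(x')\bigr) - c_y\bigl(x,y(x')\bigr) \bigr]\, y'(x').
\]
% The bracket vanishes at x' = x, so only its x'-derivative survives there:
\[
  G_x''(x) = c_{xy}\bigl(x,y(x)\bigr)\, y'(x)
           = -\bigl( f''(x) + c_{xx}(x,y(x)) \bigr).
\]
```

So $x' = x$ is a strict local maximum of $G_x$ as soon as $f''(x) + c_{xx}(x,y(x)) > 0$, which at $x = \bar x$ reads $Q + c_{xx}(\bar x, y(\bar x)) > 0$. This is presumably the role played by the conditions imposed on $Q$.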


Returning to the proof of the second case, let us consider an absolutely continuous probability measure $\mu$ on $M$ whose support is contained in $U$. Then define the nonnegative measures $\bar\nu, \hat\nu$ on $N$ by

$$\bar\nu := \tfrac{1}{2}\,\bar y_\sharp\mu \quad\text{and}\quad \hat\nu := \tfrac{1}{2}\,\hat y_\sharp\mu,$$

and set

$$\nu := \bar\nu + \hat\nu.$$

Since the functions $\bar y$ and $\hat y$ are diffeomorphisms, $\nu$ is an absolutely continuous probability measure on $N$ whose support is contained in $\bar V \cup \hat V$. Moreover, the plan $\bar\gamma$ defined by

$$\bar\gamma := \tfrac{1}{2}\,(\mathrm{Id},\bar y)_\sharp\mu + \tfrac{1}{2}\,(\mathrm{Id},\hat y)_\sharp\mu$$

has marginals $\mu$ and $\nu$ and is concentrated on the set of $(x,y) \in M \times N$ with $x \in U$ and $y \in \partial_c\psi(x) \cap (\bar V \cup \hat V)$. By (5.2)-(5.3), any plan $\gamma$ with marginals $\mu$ and $\nu$ satisfies

$$\int_{M \times N} c(x,y)\,d\gamma(x,y) \ge \int_{M \times N} [\phi(y) - \psi(x)]\,d\gamma(x,y) = \int_N \phi(y)\,d\nu(y) - \int_M \psi(x)\,d\mu(x) = \int_{\bar V \cup \hat V} \phi(y)\,d\nu(y) - \int_M \psi(x)\,d\mu(x) = \int_{M \times N} c(x,y)\,d\bar\gamma(x,y),$$

with equality in the first inequality if and only if $\gamma$ is concentrated on the set of $(x,y) \in M \times N$ with $x \in U$ and $y \in \partial_c\psi(x) \cap (\bar V \cup \hat V)$. This shows that $\bar\gamma$ is the unique optimal plan with marginals $\mu$ and $\nu$.

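It remains to show that the set of costs satisfying (1.3) is open and dense in $C^2(M \times N;\mathbb{R})$. The openness is obvious. Let us prove the density. Let $c$ be fixed in $C^2(M \times N;\mathbb{R})$ such that (1.3) is not satisfied. Let $\bar r \in \{0,\ldots,n-1\}$ be the maximum of the rank of $\frac{\partial^2 c}{\partial y\,\partial x}(x,y)$ for $(x,y) \in M \times N$, and pick some $(\bar x,\bar y) \in M \times N$ such that

$$\mathrm{rank}\,\frac{\partial^2 c}{\partial y\,\partial x}(\bar x,\bar y) = \bar r.$$

Since the rank mapping is lower semicontinuous, there are two open sets $U \subset M$ and $V \subset N$ such that the rank of $\frac{\partial^2 c}{\partial y\,\partial x}(x,y)$ is equal to $\bar r$ for any $(x,y) \in U \times V$. Moreover restricting $U$ and $V$ if necessary and taking local charts, we may assume that we work in $\mathbb{R}^n$. Let $X : V \to \mathbb{R}^n$ be the mapping defined by $X(y) = \frac{\partial c}{\partial x}(\bar x,y)$, for any $y \in V$. Doing a change of coordinates in $x$ and $y$ if necessary, we may assume that the $\bar r \times \bar r$ matrix

$$\Big(\frac{\partial^2 c}{\partial y_j\,\partial x_i}(\bar x,\bar y)\Big)_{1 \le i,j \le \bar r}$$

is invertible. Define the mapping $G : V \to \mathbb{R}^n$ by

$$G(y_1,\ldots,y_n) = \big(X(y)_1,\ldots,X(y)_{\bar r},y_{\bar r+1},\ldots,y_n\big) \qquad \forall y \in V.$$

The function $G$ is of class $C^1$ and by construction the differential of $G$ at $\bar y$ is invertible. Then $G$ is a local diffeomorphism from an open neighborhood $V' \subset V$ of $\bar y$ onto an open neighborhood $Z$ of $\bar z := G(\bar y)$. The function $X$ in $z$ coordinates is given by

$$\tilde X(z) := X\big(G^{-1}(z)\big) \qquad \forall z \in Z.$$

By construction, we have

$$\tilde X(z)_i = z_i \qquad \forall i = 1,\ldots,\bar r,\ \forall z \in Z.$$

Therefore, since $\tilde X$ has rank $\bar r$, the coordinates $(\tilde X_{\bar r+1},\ldots,\tilde X_n)$ do not depend upon the variables $z_{\bar r+1},\ldots,z_n$. Let $\delta : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be the smooth function defined by

$$\delta(x,z) = \sum_{i=\bar r+1}^{n} x_i z_i \qquad \forall x,z \in \mathbb{R}^n$$

and let $\varphi : \mathbb{R}^n \to [0,1]$ be a cut-off function which is equal to 1 in a neighborhood of $G(\bar y)$ and 0 outside $Z$. Then for every $\epsilon > 0$ the function

$$\tilde c : (x,z) \longmapsto c\big(x,G^{-1}(z)\big) + \epsilon\,\varphi(z)\,\delta(x,z)$$

has a mixed partial derivative which is invertible at $(\bar x,\bar z)$ and tends to $c$ (in $(x,z)$ coordinates) in $C^2$ topology as $\epsilon > 0$ goes to zero.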





6 Generic costs in smooth topology 

The proof of Theorem 1.7 follows by classical transversality arguments. We refer the reader to [15] for further details on the results from Thom transversality theory that we use below.

Recall that $\dim M = \dim N = n$. Denote by $J^2(M \times N;\mathbb{R})$ the smooth manifold of 2-jets from $M \times N$ to $\mathbb{R}$ and denote by $\mathcal{V}$ the set consisting of 2-jets $((x,y),\lambda,p,H)$ where $H$ is a symmetric matrix consisting of four $n \times n$ blocks

$$H = \begin{pmatrix} H_1 & H_2 \\ H_2^{*} & H_3 \end{pmatrix}$$

with $H_2$ of corank $\ge 1$. The set $\mathcal{V}$ is closed and stratified by the smooth submanifolds

$$\mathcal{V}_r := \{((x,y),\lambda,p,H) \mid \mathrm{rank}(H_2) = r\} \qquad \forall r = 0,\ldots,n-1,$$

of codimension $\ge 1$. By the Thom Transversality Theorem (see [15, Theorem 4.9 p. 54]), the set $\mathcal{C}_1$ of costs $c \in C^\infty(M \times N;\mathbb{R})$ such that $j^2c(M \times N)$ is transverse to $\mathcal{V}$ is residual. For these costs the set $\Sigma := (j^2c)^{-1}(\mathcal{V}) \subset M \times N$ is stratified of codimension $\ge 1$ and it is nonempty. As a matter of fact, for every $x \in M$, the mapping $\frac{\partial c}{\partial x}(x,\cdot) : N \to T_x^*M$ is smooth and its image $I$ is a compact subset of $T_x^*M$. Thus for every boundary point $p \in \partial I$, the function $\frac{\partial c}{\partial x}(x,\cdot)$ cannot be a local diffeomorphism in a neighborhood of any $y \in N$ such that $\frac{\partial c}{\partial x}(x,y) = p$, which shows that for such $y$ the linear mapping $\frac{\partial^2 c}{\partial y\,\partial x}(x,y)$ cannot be invertible. This shows that $\Sigma$ is not empty. The fact that $\Sigma$ is stratified of codimension $\ge 1$ (and so of zero measure) comes from the fact that it is the inverse image by $j^2c : M \times N \to J^2(M \times N;\mathbb{R})$ of $\mathcal{V}$ which is transverse to $j^2c(M \times N)$ (see [15, Theorem 4.4 p. 52]).

Using a similar argument, we next show that the set of costs without periodic chains is residual in $C^\infty(M \times N;\mathbb{R})$.

Lemma 6.1 (Cyclic chains yield optimal alternatives). Let

$$\big((x_1,y_1),\ldots,(x_L,y_L)\big) \in (M \times N)^L$$

be a chain with $x_2 = x_1$, $x_L \neq x_1$, and $y_L = y_1$. Then $L = 2K$ for some integer $K \ge 2$ and

$$\sum_{k=0}^{K-1} c(x_{2k+1},y_{2k+1}) = \sum_{k=0}^{K-1} c(x_{2k+1},y_{2k+2}).$$

Proof of Lemma 6.1. We have for any $k \in \{0,\ldots,K-1\}$,

$$x_{2k+2} = x_{2k+1} \quad\text{and}\quad y_{2k+3} = y_{2k+2},$$

where indices are read cyclically, so that $y_{2K+1} = y_1$. Then, since the set $\{(x_1,y_1),\ldots,(x_L,y_L)\}$ is c-cyclically monotone, we have

$$\begin{aligned} \sum_{k=0}^{K-1} c(x_{2k+1},y_{2k+1}) &\le \sum_{k=0}^{K-1} c(x_{2k+1},y_{2k+3}) \\ &= \sum_{k=0}^{K-1} c(x_{2k+1},y_{2k+2}) \\ &= \sum_{k=0}^{K-1} c(x_{2k+2},y_{2k+2}) \\ &\le c(x_2,y_{2K}) + \sum_{k=1}^{K-1} c(x_{2k+2},y_{2k}) \\ &= \sum_{k=0}^{K-1} c(x_{2k+1},y_{2k+1}). \end{aligned}$$

Hence all the inequalities are equalities, and we conclude easily.

□
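The identity of Lemma 6.1 can be checked on the smallest possible cycle. The sketch below is illustrative only: the four-point chain on the corners of a square, the two cost functions and the helper name `alternative_costs` are assumptions made here. For the additive cost $c(x,y) = x + y$ every finite set is c-cyclically monotone, so the lemma applies and the two sums coincide. For $c(x,y) = -xy$ the same four points are not c-cyclically monotone, the lemma does not apply, and the sums differ.

```python
def alternative_costs(chain, c):
    """Return the two sums appearing in Lemma 6.1 for a chain of even length:
    sum_k c(x_{2k+1}, y_{2k+1}) and sum_k c(x_{2k+1}, y_{2k+2})."""
    L = len(chain)
    if L % 2 != 0 or L < 4:
        raise ValueError("expected a chain of even length L = 2K >= 4")
    odd = chain[0::2]    # (x_1, y_1), (x_3, y_3), ...
    even = chain[1::2]   # (x_2, y_2), (x_4, y_4), ...
    lhs = sum(c(x, y) for x, y in odd)
    rhs = sum(c(x_odd, y_even) for (x_odd, _), (_, y_even) in zip(odd, even))
    return lhs, rhs

# Hypothetical cyclic chain: x_2 = x_1, y_3 = y_2, x_4 = x_3, y_4 = y_1.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]

additive = alternative_costs(square, lambda x, y: x + y)
twisted = alternative_costs(square, lambda x, y: -x * y)
```

In the additive case the two matchings $\{(0,0),(1,1)\}$ and $\{(0,1),(1,0)\}$ have the same total cost, which is exactly the kind of optimal alternative that destroys uniqueness.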


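We now need to work with 1-multijets of smooth functions from $M \times N$ to $\mathbb{R}$. For every even integer $L = 2K \ge 4$, we denote by $W_L$ the set of tuples

$$\Big(\big((x_1,y_1),\ldots,(x_L,y_L)\big),\ (\lambda_1,\ldots,\lambda_L),\ (p_1,\ldots,p_L)\Big)$$

satisfying

$$(x_i,y_i) \neq (x_j,y_j) \qquad \forall i \neq j \in \{1,\ldots,L\},$$

and, writing $p_i = (p_i^x,p_i^y)$ and reading indices cyclically, for all $k \in \{0,\ldots,K-1\}$,

$$\begin{cases} x_{2k+2} = x_{2k+1}, \\ y_{2k+3} = y_{2k+2}, \end{cases} \qquad \begin{cases} p^x_{2k+2} = p^x_{2k+1}, \\ p^y_{2k+3} = p^y_{2k+2}, \end{cases} \qquad \sum_{k=0}^{K-1} \lambda_{2k+1} = \sum_{k=0}^{K-1} \lambda_{2k+2}.$$

The set $W_L$ is a submanifold of $J^1_L(M \times N;\mathbb{R})$ of dimension

$$D = 4Kn + L - 1 = (2n+1)L - 1$$

and $J^1_L(M \times N;\mathbb{R})$ has dimension $(4n+1)L$. Thus $W_L$ has codimension $2nL + 1$.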
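The codimension count can be unpacked as follows. This is a bookkeeping sketch only; it assumes that the constraints defining $W_L$ are read cyclically (so that $y_{L+1} = y_1$) and that each $p_i$ splits into an $x$-part and a $y$-part with $n$ components each, a convention introduced here.

```latex
\begin{align*}
  \dim J^1_L(M \times N;\mathbb{R})
    &= L\,\bigl(\underbrace{2n}_{(x_i,y_i)} + \underbrace{1}_{\lambda_i}
       + \underbrace{2n}_{p_i}\bigr) = (4n+1)L, \\
  \dim W_L
    &= \underbrace{Kn + Kn}_{\text{free } x\text{'s and } y\text{'s}}
     + \underbrace{Kn + Kn}_{\text{free parts of the } p_i}
     + \underbrace{(L-1)}_{\lambda\text{'s, one relation}}
     = 4Kn + L - 1 = (2n+1)L - 1, \\
  \operatorname{codim} W_L
    &= (4n+1)L - (2n+1)L + 1 = 2nL + 1 .
\end{align*}
```

Since an $L$-tuple of points of $M \times N$ has only $2nL$ degrees of freedom, which is less than $2nL + 1$, a multijet map transverse to $W_L$ cannot meet $W_L$ at all. This is the standard way such a count is exploited.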

By the Multijet Transversality Theorem (see [15, Theorem 4.13 p. 57]), for each $K = 2,3,\ldots$ the set $\mathcal{C}_K$ of costs $c$ for which $j^1_{2K}c$ is transverse to $W_{2K}$ is residual. The intersection

$$\mathcal{C} = \mathcal{C}_1 \cap \Big(\bigcap_{K=2}^\infty \mathcal{C}_K\Big)$$

satisfies the conclusions of Theorem 1.7.
A Generic uniqueness of optimal plans for fixed marginals

Elaborating on a celebrated result by Mañé [19] in the framework of Aubry-Mather theory, it is
possible to prove that for fixed marginals the set of costs for which uniqueness of optimal transport 
plans holds is generic. Such a result was first obtained by Levin [18]. We include an argument here 
for comparison. 

Let $M$ and $N$ be smooth closed manifolds (meaning compact, without boundary) of dimension $n \ge 1$, let $c : M \times N \to \mathbb{R}$ be a cost function in $C^k(M \times N;\mathbb{R})$ with $k \in \mathbb{N} \cup \{\infty\}$, and let $\mu, \nu$ be two Borel probability measures. We recall that $\Pi(\mu,\nu)$ denotes the set of probability measures on $M \times N$ with first and second marginals $\mu$ and $\nu$. A measure on $M \times N$ is a continuous linear functional on the set of continuous functions $C^0(M \times N;\mathbb{R})$, and the set $E = C^0(M \times N;\mathbb{R})^*$ of such measures is equipped with the topology of weak-* convergence, in which a sequence $(\pi_l)_l$ in $E$ converges to $\pi \in E$ if and only if

$$\lim_{l \to \infty} \int_{M \times N} f\,d\pi_l = \int_{M \times N} f\,d\pi$$

for every $f \in C^0(M \times N;\mathbb{R})$. The following is classical.

Lemma A.1. The set $\Pi(\mu,\nu)$ is a nonempty compact convex set in $E$.

The following will also be useful. We refer the reader to [15] for the definition of the $C^k$-topology.

Lemma A.2. The mapping

$$(\pi,c) \in E \times C^k(M \times N;\mathbb{R}) \longmapsto \int_{M \times N} c\,d\pi$$

is continuous with respect to the weak-* topology on $E$ and the $C^k$-topology on $C^k(M \times N;\mathbb{R})$. Moreover, for every $\pi_1,\pi_2 \in \Pi(\mu,\nu)$ with $\pi_1 \neq \pi_2$, there is $c \in C^k(M \times N;\mathbb{R})$ such that

$$\int_{M \times N} c\,d\pi_1 \neq \int_{M \times N} c\,d\pi_2.$$

For every $c \in C^k(M \times N;\mathbb{R})$ let $\mathcal{M}(c)$ be the set of optimal transport plans between $\mu$ and $\nu$, that is

$$\mathcal{M}(c) := \Big\{\pi \in \Pi(\mu,\nu) \;\Big|\; \int_{M \times N} c\,d\pi \le \int_{M \times N} c\,d\pi', \ \forall \pi' \in \Pi(\mu,\nu)\Big\}.$$

By construction, $\mathcal{M}(c)$ is a nonempty compact convex subset of $\Pi(\mu,\nu)$.

Theorem A.3 (Levin). There exists a residual set $\mathcal{C} \subset C^k(M \times N;\mathbb{R})$ such that for every $c \in \mathcal{C}$, the set $\mathcal{M}(c)$ is a singleton.

Here residual refers to a countable intersection of sets with dense interiors. Theorem A.3 follows easily from results of Mañé [19] (or from arguments developed subsequently by Bernard and Contreras [4]). For the sake of completeness we provide its proof, which is based (following the approach of Bernard and Contreras [4]) on the next lemma. It shows that near any given cost function another can be found for which the minimizing facet of $K$ has arbitrarily small diameter.
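A finite caricature of Theorem A.3 may help fix ideas. The sketch below is not the argument of the appendix: it replaces the manifolds by three points each with uniform marginals, in which case (by the Birkhoff-von Neumann theorem) the optimal plans form the convex hull of the optimal permutation plans, so uniqueness amounts to having a single optimal permutation. The cost matrices and the helper name `optimal_permutations` are assumptions chosen for illustration.

```python
from itertools import permutations

def optimal_permutations(cost):
    """All permutations sigma minimizing sum_i cost[i][sigma(i)]
    (brute force; only sensible for very small matrices)."""
    n = len(cost)
    totals = {perm: sum(cost[i][perm[i]] for i in range(n))
              for perm in permutations(range(n))}
    best = min(totals.values())
    return sorted(perm for perm, total in totals.items() if total == best)

# Degenerate cost: additive in (i, j), so every permutation has the same cost.
additive = [[100 * (i + j) for j in range(3)] for i in range(3)]

# A small perturbation of it (integers keep the comparison exact).
perturbed = [[100 * (i + j) + abs(i - j) for j in range(3)] for i in range(3)]

ties = optimal_permutations(additive)
unique = optimal_permutations(perturbed)
```

The additive cost has a six-fold tie, while the perturbed cost, which differs from it by at most 2 in each entry, selects the identity permutation alone. The lemma below makes the analogous statement quantitative in the continuous setting by shrinking the diameter of the set of minimizers.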
Lemma A.4. The weak-* topology on $K$ can be metrized by a distance $d$ with the following property. Let $c_0 \in C^k(M \times N;\mathbb{R})$ be fixed. For every neighborhood $\mathcal{U}$ of $c_0$ in $C^k(M \times N;\mathbb{R})$ and every $\epsilon > 0$, there is $c \in \mathcal{U}$ such that

$$\mathrm{diam}\big(\mathcal{M}(c)\big) < \epsilon.$$

Proof of Lemma A.4. Let $\mathcal{U}$ and $\epsilon > 0$ be fixed. By compactness of $K := \Pi(\mu,\nu)$ with respect to the weak-* topology, there is a sequence $\{f_l\}_{l \in \mathbb{N}}$ of continuous functions that defines a metric $d$ on $\Pi(\mu,\nu)$ by

$$d(\pi_1,\pi_2) = \sum_{l=0}^\infty \frac{1}{2^l}\,\Big|\int_{M \times N} f_l\,d\pi_1 - \int_{M \times N} f_l\,d\pi_2\Big| \qquad \forall \pi_1,\pi_2 \in \Pi(\mu,\nu),$$

which is compatible with the weak-* topology. We claim that there is an integer $\bar l > 0$ and

$$c_1,\ldots,c_{\bar l} \in C^k(M \times N;\mathbb{R})$$

such that the continuous map

$$P_{\bar l} : \Pi(\mu,\nu) \to \mathbb{R}^{\bar l}$$

defined by

$$P_{\bar l}(\pi) := \Big(\int_{M \times N} c_1\,d\pi,\ldots,\int_{M \times N} c_{\bar l}\,d\pi\Big) \qquad \forall \pi \in \Pi(\mu,\nu),$$

satisfies

$$\mathrm{diam}\big(P_{\bar l}^{-1}(p)\big) < \epsilon \qquad \forall p \in \mathbb{R}^{\bar l}, \tag{A.1}$$

where the latter refers to the diameter with respect to $d$ of the set of measures in $\Pi(\mu,\nu)$ sent to $p$ through $P_{\bar l}$. For every $c \in C^k(M \times N;\mathbb{R})$, let

$$W_c := \Big\{(\pi_1,\pi_2) \in K \times K \;\Big|\; \int_{M \times N} c\,d\pi_1 \neq \int_{M \times N} c\,d\pi_2\Big\}.$$

By Lemma A.2, the sets $W_c$ are open and their union covers the complement of the diagonal $D = \{(\pi,\pi) \mid \pi \in K\}$. Since this complement is open in the metrizable set $K \times K$, we can extract a countable subcovering from this covering. So there is a sequence $\{c_l\}_{l \in \mathbb{N}}$ such that

$$K \times K \setminus D = \bigcup_{l} W_{c_l}. \tag{A.2}$$

We need to check that $P_{\bar l}$ satisfies (A.1) if $\bar l$ is large enough. If not, there are two sequences $\{\pi_l^1\}_l$, $\{\pi_l^2\}_l$ in $K$ such that

$$P_l(\pi_l^1) = P_l(\pi_l^2) \quad\text{and}\quad d(\pi_l^1,\pi_l^2) \ge \epsilon \qquad \forall l.$$

Then up to taking subsequences, $\{\pi_l^1\}_l$ and $\{\pi_l^2\}_l$ converge respectively to some $\pi^1,\pi^2 \in K$ with $d(\pi^1,\pi^2) \ge \epsilon$. But by (A.2), there is $m$ such that

$$\int_{M \times N} c_m\,d\pi^1 \neq \int_{M \times N} c_m\,d\pi^2.$$

But we have

$$P_m(\pi_l^1) = P_m(\pi_l^2) \qquad \forall l \ge m,$$

which passing to the limit gives $P_m(\pi^1) = P_m(\pi^2)$, a contradiction.

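Let $K' := P_{\bar l}(K)$, which is a nonempty convex compact set in $\mathbb{R}^{\bar l}$, denote by $\Psi : \mathbb{R}^{\bar l} \to \mathbb{R} \cup \{+\infty\}$ the function defined by

$$\Psi(x) := \begin{cases} \min\Big\{\int_{M \times N} c_0\,d\pi \;\Big|\; \pi \in K \text{ s.t. } P_{\bar l}(\pi) = x\Big\} & \text{if } x \in K', \\ +\infty & \text{if } x \notin K', \end{cases} \qquad \forall x \in \mathbb{R}^{\bar l},$$

and denote by $\Phi : \mathbb{R}^{\bar l} \to \mathbb{R}$ its conjugate, that is

$$\Phi(y) := \sup_{x \in \mathbb{R}^{\bar l}}\{\langle y,x\rangle - \Psi(x)\} = \max_{x \in K'}\{\langle y,x\rangle - \Psi(x)\} = \max_{\pi \in K}\Big\{\int_{M \times N} \sum_{i=1}^{\bar l} y_i c_i\,d\pi - \Psi\big(P_{\bar l}(\pi)\big)\Big\}$$

for every $y \in \mathbb{R}^{\bar l}$. By construction, $\Phi$ is convex and finite on $\mathbb{R}^{\bar l}$; moreover for every $\bar y \in \mathbb{R}^{\bar l}$ and every $\bar x \in \mathbb{R}^{\bar l}$ such that $\Phi(\bar y) = \langle \bar y,\bar x\rangle - \Psi(\bar x)$, we have

$$\Phi(\bar y) + \langle y - \bar y,\bar x\rangle = \langle y,\bar x\rangle - \Psi(\bar x) \le \Phi(y) \qquad \forall y \in \mathbb{R}^{\bar l}.$$

This means that $\bar x$ belongs to the subdifferential of $\Phi$ at $\bar y$. If in addition $\bar\pi \in K$ satisfies

$$P_{\bar l}(\bar\pi) = \bar x \quad\text{and}\quad \int_{M \times N} \Big(c_0 - \sum_{i=1}^{\bar l} \bar y_i c_i\Big)\,d\bar\pi \le \int_{M \times N} \Big(c_0 - \sum_{i=1}^{\bar l} \bar y_i c_i\Big)\,d\pi \qquad \forall \pi \in K,$$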

1=1 


then by definition of 4', we have 

/ ^^yi(^id'K T(x), Vtt € 

J MxN , J MxN , n 


Z=1 


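This means that

$$\mathcal{M}\Big(c_0 - \sum_{i=1}^{\bar l} \bar y_i c_i\Big) \subset P_{\bar l}^{-1}\big(\partial\Phi(\bar y)\big).$$

By Rademacher's theorem, for almost every $\bar y \in \mathbb{R}^{\bar l}$ the set $\partial\Phi(\bar y)$ is a singleton. We conclude by (A.1). □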

Let us now prove Theorem A.3.

Proof of Theorem A.3. For every integer $l > 0$ denote by $\mathcal{C}_l$ the set of $c \in C^k(M \times N;\mathbb{R})$ such that

$$\mathrm{diam}\big(\mathcal{M}(c)\big) < \frac{1}{l}.$$

By the continuity part in Lemma A.2, each set $\mathcal{C}_l$ is open and by Lemma A.4, it is dense as well. Then the set

$$\mathcal{C} := \bigcap_{l} \mathcal{C}_l$$

does the job. □